About this algo:  
 
         I created this algo using an NLP tokenizer and a Term Frequency/Inverse Document Frequency (TF-IDF) model to convert song lyrics into a matrix of weighted word frequencies. I then use a cosine similarity score to measure how similar a given song is to every other song. 
        The songs with the highest cosine similarity win! (And that's the result you see).  
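
The idea above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code behind this app: the regex tokenizer, the smoothed IDF formula, the helper names (`tokenize`, `tfidf_vectors`, `cosine_similarity`, `most_similar`) and the three toy lyrics are all assumptions made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simple stand-in tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Term frequency times a smoothed inverse document frequency (assumed variant).
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(index, vectors):
    """Index of the document most similar to vectors[index], excluding itself."""
    scores = [
        (cosine_similarity(vectors[index], vec), i)
        for i, vec in enumerate(vectors)
        if i != index
    ]
    return max(scores)[1]


# Hypothetical toy lyrics, for illustration only.
titles = ["Song A", "Song B", "Song C"]
lyrics = [
    "love me tender love me true",
    "tender love is all I need",
    "highway road trip driving all night",
]
vectors = tfidf_vectors(lyrics)
print(titles[most_similar(0, vectors)])  # prints "Song B"
```

Song A and Song B share the words "love" and "tender", while Song A and Song C share none, so Song B comes out on top.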
 
 
        Use cases of the underlying principle:  
        - Grant matching: Government agencies and NGOs can use this principle to match an agency's offerings against the grants that startups can apply for. 
        - HR Recommender System: Matching the best candidates to an ideal job profile by comparing the text descriptions of the two.  
        - Create your own Search Engine: The same principle is used to match search queries to results.  
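
As a rough sketch of the search-engine use case, the snippet below learns IDF weights from a small set of documents, weights a free-text query with those same values, and ranks the documents by cosine similarity. Everything here is hypothetical and for illustration only: the helper names (`fit_idf`, `vectorize`, `cosine`, `search`), the smoothed IDF formula and the three sample descriptions are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fit_idf(documents):
    """Learn a smoothed inverse document frequency for every term in the corpus."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}


def vectorize(text, idf):
    """Weight a text with the learned IDF values; terms unseen in the corpus are dropped."""
    counts = Counter(term for term in tokenize(text) if term in idf)
    return {term: count * idf[term] for term, count in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, documents, idf, top_n=3):
    """Return up to top_n documents with a non-zero similarity to the query, best first."""
    query_vec = vectorize(query, idf)
    scored = [(cosine(query_vec, vectorize(doc, idf)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_n] if score > 0]


# Hypothetical descriptions, for illustration only.
documents = [
    "Seed grant for clean energy startups",
    "Senior data engineer with Python and SQL experience",
    "Research funding for early stage biotech companies",
]
idf = fit_idf(documents)
print(search("python data engineer", documents, idf))
```

A query that shares no words with any document returns an empty list, since every similarity score is zero.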
       
         
            Dataset: What does the underlying data look like?  
             - The dataset consisted of 57,000 songs, along with their lyrics, artists and genres.  
             - The lyrics text was used to measure word frequency in each song relative to all the other songs. These frequencies were converted into a matrix, and a cosine similarity score was computed between songs to give you the final result.   
             - Data Training: The whole dataset was used to produce the result; there was no split between training and testing data.