====== cs19s23as05: Rhyming Dictionary ====== ===== Goals ===== * Practice with leveraging STL types. * Work with an interesting dataset. * Write a program with time-efficiency in mind. ----- ====== Prerequisites ====== This assignment requires familiarity with the [[:lecture materials/]] presented in class through [[:lecture materials/week 05]]. ----- ====== Background ====== ===== CMUdict ===== {{:sphinx-plate.png?nolink|Image of the sphinx-plate logo of CMU's "Sphinx knowledge base tools": http://www.speech.cs.cmu.edu/tools/}} > [[http://www.speech.cs.cmu.edu/cgi-bin/cmudict|The Carnegie Mellon University Pronouncing Dictionary]] is an open-source machine-readable pronunciation dictionary for North American English that contains over 134,000 words and their pronunciations. We have the [[https://github.com/cmusphinx/cmudict|CMUdict dataset]] available on our server, most importantly in file ''/srv/datasets/cmudict/cmudict.dict'' (also available [[http://jeff.cis.cabrillo.edu/datasets/cmudict/cmudict.dict|via HTTP]]). The format of file ''cmudict.dict'' is as follows: Lines consist of a word followed by its pronunciation as a sequence of [[https://en.wikipedia.org/wiki/Phoneme|phonemes]] in [[https://en.wikipedia.org/wiki/ARPABET|ARPAbet]] format, e.g.: dinosaur D AY1 N AH0 S AO2 R superstore S UW1 P ER0 S T AO2 R Words with multiple pronunciations are suffixed with (n), e.g.: subject S AH0 B JH EH1 K T subject(2) S AH1 B JH IH0 K T Phonemes that indicate a syllable have integer suffixes: * Unstressed syllables: ''0'' * Primary stressed syllables: ''1'' * Secondary stressed syllables: ''2'' ----- ====== Assignment ====== You shall write a C++ program that acts as a **rhyming dictionary**. The program shall: - Expect a one-word command-line argument that will serve as a case-insensitive **query**. - Consider //rhymes// to be those words that share identical phoneme patterns from the final stressed phoneme to the end of the word. - Print, in lowercase and in alphabetical order, all words from the [[#CMUdict]] dataset that rhyme with the query, excluding the query itself. - By default, print only those rhymes that also have the same number of syllables as the query. - Otherwise, if a second command-line exists and is the string ''-a'', print **all** words that rhyme with the query, regardless of syllable count. Make sure to choose your data types and logic carefully! Review the materials from week 05 in which we discussed the strengths and weaknesses of the different STL containers. Your program will be expected to terminate in **no more than five times the running time** of the sample executable, and consume **no more than ten times its amount of memory**, when compiled with ''-Ofast''. ===== Sample Executable ===== Sample executable ''cs19_cmudict_rhymes'' exists on the server and demonstrates the expected behavior of your program: $ cs19_cmudict_rhymes rhyme beim chime climb crime dime grime haim heim hime i'm kime lime lyme mime prime rime seim sime slime syme thyme time $ cs19_cmudict_rhymes rhyme -a all-time anticrime beim chime climb clothestime crime dime grime haim heim hime i'm kime lime lyme mime one-time onetime part-time prime rime seim sime slime sublime syme thyme time $ cs19_cmudict_rhymes putin shootin' $ cs19_cmudict_rhymes putin -a highfalutin rasputin shootin' ===== Leaderboard ===== As submissions are received, this leaderboard will be updated with the top-performing fully/nearly functional solutions, with regard to execution speed.
RankTest Time (s)Memory Usage (kB)SLOC (lines)User
----- ====== Submission ====== Submit your source-code file(s) via [[info:turnin]]. {{https://jeff.cis.cabrillo.edu/images/feedback-robot.png?nolink }} //**Feedback Robot**// This project has a feedback robot that will run some tests on your submission and provide you with a feedback report via email within roughly one minute. Please read the feedback carefully. ====== Due Date and Point Value ====== Due at 23:59:59 on the date listed on the [[:syllabus|syllabus]]. ''Assignment 05'' is worth 60 points. Possible point values per category: --------------------------------------- Correct output... 40 ...under memory threshold 20 Possible deductions: Exceeding time threshold 100% Style and practices 10–20% Possible extra credit: Submission via Git 5% ---------------------------------------