====== cs20ps23as04: Scrabble® Generators ====== ===== Goals ===== * Practice with generator functions, generator expressions, and functional programming in general. * Implement part of the logic for a NPC in a well-known board game. ----- ====== Prerequisites ====== This assignment requires familiarity with the [[:lecture materials/]] presented in class through [[:lecture materials/week 04]]. ----- ====== Background ====== ===== Scrabble® Datasets =====
If you've ever played [[https://en.wikipedia.org/wiki/Scrabble|Scrabble®]], you may know that there are official Scrabble® dictionaries containing lists of legal words. We have the equivalent in file ''/srv/datasets/scrabble-hybrid'' (also available [[http://jeff.cis.cabrillo.edu/datasets/scrabble-hybrid|via HTTP]]), which contains 275381 legal Scrabble® words, one word per line. Here are the first five lines: LARRIGAN KUNE SINGLETON ACHENIUMS LOCAL Words in this file shall constitute the entirety of //legal// Scrabble® words for the purposes of this assignment. Additionally, file ''[[http://jeff.cis.cabrillo.edu/datasets/scrabble-letter-values|/srv/datasets/scrabble-letter-values]]'' describes the value of each letter tile in Scrabble®. ===== Using a regex to extract potential Scrabble® words ===== In order to take full advantage of the space and time efficiency on this one, consider using ''[[https://docs.python.org/3/library/re.html#re.finditer|re.finditer()]]'' to create an iterator that will yield all potential scrabble words in a string. A regular expression matching all contiguous sequences of uppercase English letters is simple enough: [A-Z]+ You might consider combining this with ''[[https://docs.python.org/3/library/itertools.html#itertools.chain.from_iterable|itertools.chain.from_iterable()]]'' to "flatten" the results of iterating a file object: >>> import itertools, re >>> sum(1 for _ in itertools.chain.from_iterable( ... re.finditer('[A-Z]+', line.upper()) for line in open('/srv/datasets/party.txt'))) 20 Note you could verify that there are 20 "words" in that file using some command-line utilities, and/or view the words: $ tr '[:lower:]' '[:upper:]' ----- ====== Assignment ====== You shall implement a module named ''scrabble'' containing definitions for various Scrabble®-related functions. Take care to accommodate the following: - The two dataset files must be read once and only once, when the module is imported/executed by the interpreter. Multiple calls to any of the required module functions shall not result in reading from those files multiple times. - The [[https://docs.python.org/3/library/doctest.html|doctests]] in the function docstrings will help explain how the functions are supposed to behave, and also constitute embedded runnable tests. To run the tests and get a summary of what passed/failed: python -m doctest scrabble.py # only produces output if any tests fail python -m doctest -v scrabble.py # gives a detailed summary of all passed/failed tests ===== Starter Code ===== """ Generator and other functions related to Scrabble® words. """ from io import TextIOBase # for type hints from typing import Iterable, Iterator # for type hints import itertools # suggested for permutations() and chain.from_iterable() import re # suggested for finditer() # You will want to establish some module-scoped variables representing objects with the information # from files /srv/datasets/scrabble-letter-values and /srv/datasets/scrabble-hybrid # so that the various functions below have access to them as necessary. def tokenize_words(file: TextIOBase) -> Iterator[str]: """ Tokenizes the contents of a text-file object (e.g. from open() or sys.stdin) and yields all contiguous sequences of English letters from the file, in uppercase. >>> from pprint import pprint >>> pprint(list(tokenize_words(open('/srv/datasets/party.txt'))), compact=True) ['THE', 'PARTY', 'TOLD', 'YOU', 'TO', 'REJECT', 'THE', 'EVIDENCE', 'OF', 'YOUR', 'EYES', 'AND', 'EARS', 'IT', 'WAS', 'THEIR', 'FINAL', 'MOST', 'ESSENTIAL', 'COMMAND'] >>> pprint(list(tokenize_words(open('/srv/datasets/phonewords-e.161.txt')))) ['ABC', 'DEF', 'GHI', 'JKL', 'MNO', 'PQRS', 'TUV', 'WXYZ'] """ pass # TODO def legal_words(words: Iterable[str]) -> Iterator[str]: """ Selects from an iterable collection of strings only those that are legal Scrabble® words. >>> from pprint import pprint >>> pprint(list(legal_words(tokenize_words(open('/srv/datasets/party.txt')))), compact=True) ['THE', 'PARTY', 'TOLD', 'YOU', 'TO', 'REJECT', 'THE', 'EVIDENCE', 'OF', 'YOUR', 'EYES', 'AND', 'EARS', 'IT', 'WAS', 'THEIR', 'FINAL', 'MOST', 'ESSENTIAL', 'COMMAND'] >>> pprint(list(legal_words(tokenize_words(open('/srv/datasets/phonewords-e.161.txt'))))) ['DEF', 'GHI'] >>> pprint(list(legal_words(tokenize_words(open('/srv/datasets/empty'))))) [] >>> list(legal_words(['all', 'in', 'lowercase'])) [] """ pass # TODO def word_score(word: str) -> int: """ Computes the sum of the tile values for a given word, or 0 if the word is illegal. >>> from pprint import pprint >>> pprint([(w, word_score(w)) for w in tokenize_words(open('/srv/datasets/party.txt'))], \ compact=True) [('THE', 6), ('PARTY', 10), ('TOLD', 5), ('YOU', 6), ('TO', 2), ('REJECT', 15), ('THE', 6), ('EVIDENCE', 14), ('OF', 5), ('YOUR', 7), ('EYES', 7), ('AND', 4), ('EARS', 4), ('IT', 2), ('WAS', 6), ('THEIR', 8), ('FINAL', 8), ('MOST', 6), ('ESSENTIAL', 9), ('COMMAND', 14)] >>> pprint([(w, word_score(w)) \ for w in tokenize_words(open('/srv/datasets/phonewords-e.161.txt'))], \ compact=True) [('ABC', 0), ('DEF', 7), ('GHI', 7), ('JKL', 0), ('MNO', 0), ('PQRS', 0), ('TUV', 0), ('WXYZ', 0)] >>> word_score('lowercase') 0 """ pass # TODO def highest_value_word(words: Iterable[str]) -> str: """ Selects from an iterable collection of strings the highest-valued Scrabble® word, as returned by the word_score() function. If multiple words are maximal, the function returns the first one encountered. If no words are present, raises an exception of some kind. >>> highest_value_word(tokenize_words(open('/srv/datasets/party.txt'))) 'REJECT' >>> highest_value_word(tokenize_words(open('/srv/datasets/phonewords-e.161.txt'))) 'DEF' >>> try: ... highest_value_word(tokenize_words(open('/srv/datasets/empty'))) ... except Exception as error: ... print('error') ... error """ pass # TODO def legal_tile_words(tiles: str) -> list[str]: """ Returns a sorted list of all the legal Scrabble® words that could be formed from the "tiles" represented by the argument string. >>> legal_tile_words('JTQHDEZ') ['DE', 'ED', 'EDH', 'EH', 'ET', 'ETH', 'HE', 'HET', 'JET', 'TE', 'TED', 'THE', 'ZED'] """ pass # TODO if __name__ == '__main__': # Print the total number of legal words on stdin and their total value in points, just for fun import sys (i1, i2) = itertools.tee(legal_words(tokenize_words(sys.stdin))) print(sum(1 for _ in i1), 'legal Scrabble words worth', sum(map(word_score, i2)), 'points') ===== Leaderboard ===== As submissions are received, this leaderboard will be updated with the top-performing fully/nearly functional solutions, with regard to execution speed.
RankTest Time (s)Memory Usage (kB)SLOC (lines)User
----- ====== Submission ====== Submit ''scrabble.py'' via [[info:turnin]]. {{https://jeff.cis.cabrillo.edu/images/feedback-robot.png?nolink }} //**Feedback Robot**// This project has a feedback robot that will run some tests on your submission and provide you with a feedback report via email within roughly one minute. Please read the feedback carefully. ====== Due Date and Point Value ====== Due at 23:59:59 on the date listed on the [[:syllabus|syllabus]]. ''Assignment 04'' is worth 60 points. Possible point values per category: --------------------------------------- Properly implemented functions 60 (split evenly) Possible deductions: Style and practices 10–20% Possible extra credit: Submission via Git 5% ---------------------------------------