Changeset 153
- Timestamp: 05/20/08 21:55:21
- Files:
- trunk/CHANGELOG.txt (modified) (1 diff)
- trunk/MANIFEST.in (modified) (1 diff)
- trunk/contrib/move_texts.py (added)
- trunk/shakespeare/model/dm.py (modified) (1 diff)
- trunk/shakespeare/tests/test_model.py (modified) (1 diff)
- trunk/shksprdata/texts/all_is_well_that_ends_well_gut_f.txt (added)
- trunk/shksprdata/texts/alls_well_that_ends_well_gut.txt (added)
- trunk/shksprdata/texts/anthonie_and_cleopatra_gut_f.txt (added)
- trunk/shksprdata/texts/antony_and_cleopatra_gut.txt (added)
- trunk/shksprdata/texts/as_you_like_it_gut.txt (added)
- trunk/shksprdata/texts/as_you_like_it_gut_f.txt (added)
- trunk/shksprdata/texts/comedy_of_errors_gut.txt (added)
- trunk/shksprdata/texts/comedy_of_errours_gut_f.txt (added)
- trunk/shksprdata/texts/coriolanus_gut.txt (added)
- trunk/shksprdata/texts/coriolanus_gut_f.txt (added)
- trunk/shksprdata/texts/cymbeline_gut.txt (added)
- trunk/shksprdata/texts/cymbeline_gut_f.txt (added)
- trunk/shksprdata/texts/hamlet_gut.txt (added)
- trunk/shksprdata/texts/hamlet_gut_f.txt (added)
- trunk/shksprdata/texts/henry_iv_part_1_gut.txt (added)
- trunk/shksprdata/texts/henry_iv_part_1_gut_f.txt (added)
- trunk/shksprdata/texts/henry_iv_part_2_gut.txt (added)
- trunk/shksprdata/texts/henry_iv_part_2_gut_f.txt (added)
- trunk/shksprdata/texts/henry_v_gut.txt (added)
- trunk/shksprdata/texts/henry_v_gut_f.txt (added)
- trunk/shksprdata/texts/henry_vi_part_1_gut.txt (added)
- trunk/shksprdata/texts/henry_vi_part_1_gut_f.txt (added)
- trunk/shksprdata/texts/henry_vi_part_2_gut.txt (added)
- trunk/shksprdata/texts/henry_vi_part_2_gut_f.txt (added)
- trunk/shksprdata/texts/henry_vi_part_3_gut.txt (added)
- trunk/shksprdata/texts/henry_vi_part_3_gut_f.txt (added)
- trunk/shksprdata/texts/henry_viii_gut.txt (added)
- trunk/shksprdata/texts/henry_viii_gut_f.txt (added)
- trunk/shksprdata/texts/john_gut.txt (added)
- trunk/shksprdata/texts/john_gut_f.txt (added)
- trunk/shksprdata/texts/julius_caesar_gut.txt (added)
- trunk/shksprdata/texts/julius_caesar_gut_f.txt (added)
- trunk/shksprdata/texts/lear_gut.txt (added)
- trunk/shksprdata/texts/lear_gut_f.txt (added)
- trunk/shksprdata/texts/lovers_complaint_gut.txt (added)
- trunk/shksprdata/texts/loves_labour_lost_gut_f.txt (added)
- trunk/shksprdata/texts/loves_labours_lost_gut.txt (added)
- trunk/shksprdata/texts/macbeth_gut.txt (added)
- trunk/shksprdata/texts/macbeth_gut_f.txt (added)
- trunk/shksprdata/texts/measure_for_measure_gut.txt (added)
- trunk/shksprdata/texts/measure_for_measure_gut_f.txt (added)
- trunk/shksprdata/texts/merchant_of_venice_gut.txt (added)
- trunk/shksprdata/texts/merchant_of_venice_gut_f.txt (added)
- trunk/shksprdata/texts/merry_wives_of_windsor_gut.txt (added)
- trunk/shksprdata/texts/merry_wives_of_windsor_gut_f.txt (added)
- trunk/shksprdata/texts/metadata.txt (added)
- trunk/shksprdata/texts/midsummer_nights_dream_gut.txt (added)
- trunk/shksprdata/texts/midsummer_nights_dreame_gut_f.txt (added)
- trunk/shksprdata/texts/much_ado_about_nothing_gut.txt (added)
- trunk/shksprdata/texts/much_ado_about_nothing_gut_f.txt (added)
- trunk/shksprdata/texts/othello_gut.txt (added)
- trunk/shksprdata/texts/othello_gut_f.txt (added)
- trunk/shksprdata/texts/passionate_pilgrim_gut.txt (added)
- trunk/shksprdata/texts/pericles_gut.txt (added)
- trunk/shksprdata/texts/phoenix_and_the_turtle_gut.txt (added)
- trunk/shksprdata/texts/rape_of_lucrece_gut.txt (added)
- trunk/shksprdata/texts/richard_ii_gut.txt (added)
- trunk/shksprdata/texts/richard_ii_gut_f.txt (added)
- trunk/shksprdata/texts/richard_iii_gut.txt (added)
- trunk/shksprdata/texts/richard_iii_gut_f.txt (added)
- trunk/shksprdata/texts/romeo_and_juliet_gut.txt (added)
- trunk/shksprdata/texts/romeo_and_juliet_gut_f.txt (added)
- trunk/shksprdata/texts/sonnets_gut.txt (added)
- trunk/shksprdata/texts/taming_of_the_shrew_gut.txt (added)
- trunk/shksprdata/texts/taming_of_the_shrew_gut_f.txt (added)
- trunk/shksprdata/texts/tempest_gut.txt (added)
- trunk/shksprdata/texts/tempest_gut_f.txt (added)
- trunk/shksprdata/texts/timon_of_athens_gut.txt (added)
- trunk/shksprdata/texts/timon_of_athens_gut_f.txt (added)
- trunk/shksprdata/texts/titus_andronicus_gut_f.txt (added)
- trunk/shksprdata/texts/tragedy_of_titus_andronicus_gut.txt (added)
- trunk/shksprdata/texts/troilus_and_cressida_gut.txt (added)
- trunk/shksprdata/texts/twelfe-night_gut_f.txt (added)
- trunk/shksprdata/texts/twelfth_night_gut.txt (added)
- trunk/shksprdata/texts/two_gentlemen_of_verona_gut.txt (added)
- trunk/shksprdata/texts/two_gentlemen_of_verona_gut_f.txt (added)
- trunk/shksprdata/texts/winters_tale_gut.txt (added)
- trunk/shksprdata/texts/winters_tale_gut_f.txt (added)
trunk/CHANGELOG.txt
--- Revision 127
+++ Revision 153
+v0.5: 2008-05-10
+================
+
+* Move to Pylons and rework web interface
+* Move command line interface to use pastescript
+* Now have Milton in addition to Shakespeare
+* Store copies of texts in package (shksprdata) rather than downloading.
+
 v0.4: 2007-04-16
 ================

 * Annotation of texts (js-based in browser) (ticket:20, ticket:21)
   (<http://www.openshakespeare.org/2007/04/10/annotation-is-working/>)
 * Switch to unicode for internal string handling (resolves ticket:23: some
   texts breaking the viewer)
 * Add functional tests for the web interface (ticket:11)
 * Substantial improvements to speed of concordance (ticket:22)
   (<http://www.openshakespeare.org/2007/01/03/improvements-to-the-concordance/>)
 * Switch to genshi templates from kid
 * Switch to plain WSGI from cherrypy

 Outstanding Issues
 ------------------

 * Annotation cannot handle long texts because of javascript performance
   issues


 v0.3: 2006-10-04
 ================

 * Can now view mutiple texts side by side (ticket:15). See it in action at:
   <http://demo.openshakespeare.org/view?name=othello_gut_f+othello_gut>
 * Now include moby/bosak versions of shakespeare as well as gutenberg
   (ticket:10) (though more work remains to be done to process these versions
   to plaintext and html)
 * Fix bug whereby we were missing some of the available gutenberg texts
   (ticket:18)
 * Install the shakespeare python package (ticket:16)
 * Move to py.test from unittest
 * New project website at <http://www.openshakespeare.org/>

 Outstanding Issues
 ------------------

 * Several of the source texts (all of them Gutenberg folios) seem to
   break the viewer due to kid (the templating system) complaining about about
   'not well-formed (invalid token) xml'. Any help in tracking this down would
   be greatly appreciated.


 v0.2 2006-07-16
 ===============

 * Database backend with proper domain model (ticket:6)
 * Text snippets in concordance system and links through to source (ticket:12)
 * Sources document (ticket:5)

trunk/MANIFEST.in
--- Revision 148
+++ Revision 153
 recursive-include shakespeare/public *
 recursive-include shakespeare/templates *
+recursive-include shksprdata

trunk/shakespeare/model/dm.py
--- Revision 150
+++ Revision 153
 """
 Domain model

 Material contains all data we have including shakespeare texts. A text is taken
 to be a specific version of a work. e.g. the 1623 folio of King Richard III.

 We may in future add a Work object to refer to 'abstract' work of which a given
 text is a version.
 """
 import sqlobject

 # make sure config is registered
 import shakespeare
 shakespeare.conf()

 from pylons.database import PackageHub
 hub = PackageHub('shakespeare')
 sqlobject.sqlhub.processConnection = hub.getConnection()

 import shakespeare
 import shakespeare.cache

 # import other sqlobject items
 from annotater.model import Annotation
 import annotater.model

 # note we run this at bottom of module to auto create db tables on import
 def createdb():
     Material.createTable(ifNotExists=True)
     Concordance.createTable(ifNotExists=True)
     Statistic.createTable(ifNotExists=True)
     annotater.model.createdb()

 def cleandb():
     Statistic.dropTable(ifExists=True)
     Concordance.dropTable(ifExists=True)
     Material.dropTable(ifExists=True)
     annotater.model.cleandb()

 def rebuilddb():
     cleandb()
     createdb()

 class Material(sqlobject.SQLObject):
     """Material related to Shakespeare (usually text of works and ancillary
     matter such as introductions).

     NB: can not use 'text' as class name as it is an sql reserved word

     @attribute name: a unique name identifying the material

     TODO: mutiple creators ??
     """

     name = sqlobject.StringCol(alternateID=True)
     title = sqlobject.StringCol(default=None, length=255)
     # creator rather than author to fit with dublin core
     creator = sqlobject.StringCol(default=None, length=255)
     url = sqlobject.StringCol(default=None, length=255)
     notes = sqlobject.StringCol(default=None)

     def get_cache_path(self, format):
         """Get path within cache to data file associated with this material.
         @format: the version ('plain', original='' etc)
         """
         return shakespeare.cache.default.path(self.url, format)

+    def get_store_fileobj(self):
+        import pkg_resources
+        pkg = 'shksprdata'
+        # default to plain txt format (TODO: generalise this)
+        path = 'texts/%s.txt' % self.name
+        fileobj = pkg_resources.resource_stream(pkg, path)
+        return fileobj
+
+
 class Concordance(sqlobject.SQLObject):

     text = sqlobject.ForeignKey('Material')
     word = sqlobject.StringCol(length=50)
     line = sqlobject.IntCol()
     char_index = sqlobject.IntCol()

     word_index = sqlobject.DatabaseIndex('word')
     text_index = sqlobject.DatabaseIndex('text')

 class Statistic(sqlobject.SQLObject):

     text = sqlobject.ForeignKey('Material')
     word = sqlobject.StringCol(length=50)
     occurrences = sqlobject.IntCol(default=1)

     word_index = sqlobject.DatabaseIndex('word')
     text_index = sqlobject.DatabaseIndex('text')


 # auto create db tables on import
 createdb()

trunk/shakespeare/tests/test_model.py
--- Revision 150
+++ Revision 153
 import sqlobject

 import shakespeare.model as model

 class TestMaterial(object):

     @classmethod
     def setup_class(self):
         self.name = 'test-123'
         self.title = 'Hamlet'
         self.url = 'http://www.openshakespeare.org/blah.txt'
         self.text = model.Material(name=self.name,
                 title=self.title, url=self.url)

     @classmethod
     def teardown_class(self):
         model.Material.delete(self.text.id)

     def test1(self):
         txtid = self.text.id
         txt2 = model.Material.get(txtid)
         txt3 = model.Material.byName(self.name)
         assert self.text.id == txt2.id
         assert self.text.id == txt3.id

     def test_get_cache_path(self):
         out = self.text.get_cache_path('plain')
         # do not want anything too specific or we end up duplicating cache_test
         assert len(out) > 0
+
+    def test_get_store_fileobj(self):
+        text = model.Material.byName('phoenix_and_the_turtle_gut')
+        out = text.get_store_fileobj()
+        out = out.read()
+        assert len(out) > 0
+        assert out[:26] == 'THE PHOENIX AND THE TURTLE'
+

 class TestConcordance(object):

     @classmethod
     def setup_class(self):
         self.name = 'test-123'
         self.title = 'Hamlet'
         self.text = model.Material(name=self.name, title=self.title)
         word = 'jones'
         line = 20
         char_index = 500
         self.cc1 = model.Concordance(text=self.text,
                 word=word,
                 line=line,
                 char_index=char_index)

     @classmethod
     def teardown_class(self):
         model.Concordance.delete(self.cc1.id)
         model.Material.delete(self.text.id)

     def test1(self):
         out1 = model.Concordance.get(self.cc1.id)
         assert self.text == out1.text

 class TestStatistic:

     @classmethod
     def setup_class(self):
         self.name = 'test-123'
         self.title = 'Hamlet'
         self.text = model.Material(name=self.name, title=self.title)
         self.word = 'jones'
         self.occurrences = 5
         self.cc1 = model.Statistic(
                 text=self.text,
                 word=self.word,
                 occurrences=self.occurrences
                 )

     @classmethod
     def teardown_class(self):
         model.Statistic.delete(self.cc1.id)
         model.Material.delete(self.text.id)

     def test1(self):
         out1 = model.Statistic.get(self.cc1.id)
         assert self.text == out1.text
         assert out1.occurrences == self.occurrences

     def test_select(self):
         tresults = model.Statistic.select(
                 sqlobject.AND(
                     model.Statistic.q.textID == self.text.id,
                     model.Statistic.q.word == self.word,
                 ))
         num = tresults.count()
         assert num == 1
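
Note: the new Material.get_store_fileobj maps a material name to texts/<name>.txt inside the shksprdata package and returns an open stream. The lookup can be tried without the project installed using the rough sketch below. It is not the project's code: it swaps pkg_resources.resource_stream for the standard library's pkgutil.get_data, and the demo_shksprdata package, its one-file contents, and the helpers build_demo_package and read_demo_title are invented purely for illustration.

```python
import importlib
import io
import os
import pkgutil
import shutil
import sys
import tempfile

# Hypothetical stand-in for the real 'shksprdata' package.
DEMO_PKG = "demo_shksprdata"


def build_demo_package(root, pkg=DEMO_PKG):
    """Create a throwaway package holding one text under texts/."""
    texts_dir = os.path.join(root, pkg, "texts")
    os.makedirs(texts_dir)
    # An (empty) __init__.py makes the directory an importable package.
    with open(os.path.join(root, pkg, "__init__.py"), "w"):
        pass
    # Stand-in data, not the actual Gutenberg file.
    with open(os.path.join(texts_dir, "phoenix_and_the_turtle_gut.txt"), "wb") as f:
        f.write(b"THE PHOENIX AND THE TURTLE\n\nLet the bird of loudest lay\n")


def get_store_fileobj(pkg, name):
    """Return a binary file-like object for texts/<name>.txt inside pkg."""
    path = "texts/%s.txt" % name
    # get_data returns bytes, or None if the package's loader cannot serve data.
    data = pkgutil.get_data(pkg, path)
    if data is None:
        raise IOError("no packaged data for %r in %r" % (path, pkg))
    return io.BytesIO(data)


def read_demo_title():
    """Build the demo package, read the text back, and clean up afterwards."""
    root = tempfile.mkdtemp()
    try:
        build_demo_package(root)
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        fileobj = get_store_fileobj(DEMO_PKG, "phoenix_and_the_turtle_gut")
        return fileobj.read()[:26]
    finally:
        if root in sys.path:
            sys.path.remove(root)
        # Forget the demo module so a later call does not reuse the deleted path.
        sys.modules.pop(DEMO_PKG, None)
        shutil.rmtree(root, ignore_errors=True)


if __name__ == "__main__":
    print(read_demo_title())
```

As in test_get_store_fileobj, the check is simply that the first 26 characters are the title. Under Python 3 the data comes back as bytes, so a comparison would be against a bytes literal rather than the plain string used in the changeset's test.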
