Papyrological Navigator
one component of IDPSoftware
The PN is "a customized search engine ... capable of retrieving information from multiple related sites." It is a custom web application, prototyped at Columbia, that is intended to replace the current production applications for APIS and DDBDP and to provide access to the full content of HGV records. It is therefore envisioned as capable of searching both metadata and texts, and of displaying metadata records, texts, and images pulled from all three source datasets. Links to other data sources, such as Trismegistos, are planned for the near term. Content expansion to incorporate other full datasets is envisioned further down the road. The PN employs the portal metaphor.
Background on the PN, from Columbia.
Software and language components:
Lots of Java:
- Apache Tomcat
- Apache Jetspeed-2 portlet container
- portlet spec
- Velocity templates
- Apache Derby supports the portlet container, but is not really part of the app
- Some javax.naming implementations to bridge the bits of a request served by the container to the portlet app
- Lucene
  - For the PN, really just indexing code. The DDb has lots of fancy tokenizers, etc.
- eRez/FSI image-serving software, and the Flash plugin FSI licenses to view images
  - a web-app that serves up XML config files for the Flash plugin based on the APIS id of the document in question
Modules/Portlets
Metadata Search Portlet
http://apptest.cul.columbia.edu:8082/navigator/portal/default-page.psml
This is the only component currently linked from the top level at papyri.info.
Image Portlet
The Image Portlet provides for the online viewing and limited manipulation of APIS-hosted images via the PN. This portlet is supported server-side by licensed proprietary software: eRez/FSI image server and Flash Plugin FSI. Prior development has taken advantage of institutional licenses at Columbia for these. Columbia's licenses will continue to cover production needs through July 2010. Development work at NYU may require access to separate licenses, and when NYU takes over production responsibilities in 2010, licenses will definitely be required.
Text Portlet
Experimental text search interface (not linked from main public page yet): http://appdev.cul.columbia.edu:8082/ddbdp-nav/search
Statistics regarding content indexed for same (also not public yet): http://appdev.cul.columbia.edu:8082/ddbdp-nav/stats
The Text Portlet (a.k.a. the DDbDP app) supports rendering and manipulation of DDbDP-originated papyrus texts in their original languages. No proprietary software is involved. The application takes advantage of SRU in its query interface. The search back-end currently expects a valid CQL query ( http://www.loc.gov/standards/sru/specs/cql.html ) rather than the input of any particular HTML form. The number of search clauses is therefore limited only by how many inputs the HTML form provides.
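Because the back-end only needs a valid CQL string, a form handler can assemble one from however many clauses the form supplies. The following is a minimal sketch of that idea, not the PN's actual code: the class `CqlQueryBuilder`, its `Clause` type, and the index names `text` and `series` are all hypothetical, and only the basic CQL shape (index, relation, quoted term, joined by a boolean operator) is assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: turns any number of form clauses into one CQL query string.
class CqlQueryBuilder {

    /** One search clause, e.g. one row of inputs in an HTML form. */
    static final class Clause {
        final String index;    // index name (hypothetical, e.g. "text")
        final String relation; // CQL relation such as "=", "any", "all"
        final String term;     // raw user input

        Clause(String index, String relation, String term) {
            this.index = index;
            this.relation = relation;
            this.term = term;
        }
    }

    /** Wraps a term in double quotes, escaping backslashes and embedded quotes. */
    static String quote(String term) {
        return "\"" + term.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Joins all non-blank clauses with the given boolean operator ("and", "or", "not"). */
    static String build(List<Clause> clauses, String booleanOp) {
        List<String> parts = new ArrayList<>();
        for (Clause c : clauses) {
            if (c.term == null || c.term.trim().isEmpty()) {
                continue; // skip form rows the user left blank
            }
            parts.add(c.index + " " + c.relation + " " + quote(c.term));
        }
        return String.join(" " + booleanOp + " ", parts);
    }

    public static void main(String[] args) {
        List<Clause> clauses = Arrays.asList(
            new Clause("text", "all", "strategos"),
            new Clause("series", "=", "P.Oxy."),
            new Clause("text", "=", "")); // blank row, ignored
        System.out.println(build(clauses, "and"));
    }
}
```

Adding another input row to the form then only means adding another `Clause` to the list; the query-building code does not change. Note that CQL evaluates boolean operators left to right with equal precedence, so a real form that mixes operators would need parentheses.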
Translation Portlet
(Description pending.) APIS translations. HGV translations.
Getting data into the PN
A workflow for refreshing data is still pretty much nonexistent, although automating it is an [IDP1] deliverable for June/July 2008. As of now, three huge XML files are indexed by Lucene: APIS data, HGV data, and a composite file called 'aggregated' (http://epiduke.cch.kcl.ac.uk/aggregated/) that holds a mapped subset of the other two. In the near future we will probably put up a harvestable interface to APIS data instead of relying on the Big XML File. In any case, that is what goes into the PN as of now.
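Files of this size are usually read as a stream rather than loaded whole. The sketch below shows one way to walk such a file with the JDK's StAX API and hand each record to a callback, which is where an indexer would be fed. It is an illustration under stated assumptions, not the PN's indexing code: the class `BigXmlRecordReader`, the element name `record`, and the attribute name `id` are hypothetical, and the Lucene step is left out to keep the example free of external dependencies.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: streams over a large XML file one record element at a time.
class BigXmlRecordReader {

    /**
     * Calls handler with the id attribute of every element named recordElement
     * and returns how many such elements were seen. Memory use stays flat
     * because the document is never built as a tree.
     */
    static int forEachRecordId(InputStream in, String recordElement,
                               String idAttribute, Consumer<String> handler) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // do not load DTDs
        int count = 0;
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && recordElement.equals(reader.getLocalName())) {
                        // An indexer would build and add its document here.
                        handler.accept(reader.getAttributeValue(null, idAttribute));
                        count++;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML input", e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Tiny stand-in for one of the large files; element names are assumptions.
        String xml = "<records><record id=\"a1\"/><record id=\"b2\"/></records>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        int n = forEachRecordId(in, "record", "id", id -> System.out.println("would index " + id));
        System.out.println(n + " records seen");
    }
}
```

The callback keeps the reading loop independent of what is done with each record, so the same loop could serve all three files, or later a harvested feed.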
Specific PN Software Development Tasks
- Interface Assessment: "conduct usability testing in order to identify areas of possible improvement"
- Enhancements to new (Greek) lemmatized and proximity searching
- the refinement of combined metadata and text searching functionality
- mechanisms for automatic ingestion of updated EpiDoc / UNICODE texts and metadata from the DDbDP and HGV
- implementation of basic EpiDoc support for APIS translations
- investigation and, if feasible, implementation of XML-based interoperability with non-APIS digital projects like the German Papyrus Portal and databases involved in Trismegistos
Documents:
- In the APIS Phase 6 proposal (pdf), see:
- Section 5.1: Interface Assessment, PDF document pages 21-22
- Section 5.2.1: Papyrological Navigator (papyri.info), PDF document pages 22-23.
- Appendix D: APIS/Papyrological Navigator Work Plan Details, PDF document pages 39-45
Source code
Not yet imported into the IDP SVN repository.
Related Milestones
- milestone:"PN Code Handover" (see details at CodeHandover)
Related tickets (query)
See also: NavigatorProgress
- #2: Aggregation incomplete
- #5: HGV publication numbers with spans being parsed incorrectly
- #6: Synthesized HGV metadata must not be displayed as HGV without caveat
- #8: Final page of search results shows incorrect summary information
- #9: HGV Translation portlet's SOURCE link goes to HGV metadata
- #10: DDb searches within Series that have NO VOLUMES are not working
- #23: Papyrologists close down PN google doc
- #36: Revisit and prepare Mapping file for Leuven
- #37: Ensure that there is an image viewing tool for ISAW-hosted PN
- #38: Text export option
- #41: Entry with Unicode characters in PN
- #42: Browsers and PN
- #43: Standardize sorted display of publications in PN result sets
- #44: Use of grc-Grek in PN display
- #45: Missing texts, literary texts, collaboration with LDAB
- #47: Results screen should display line numbers in regular numerical order
- #48: Allow error reporting on every page of the PN.
- #50: Only thumbnail images in PN
- #54: spec storage requirements for PN/NS footprint at NYU/diglib
- #56: Change publication search in PN to fall back to a prefix search
- #69: consolidate/document IDP2 and Concordia PN tasks for PN programmer
- #75: report on how PN gets IDP content now
- #92: Institute maven build system for PN
- #93: Set up NYU maven repository
- #94: Reorganize PN projects in the svn repo
- #95: Port pn-metadata-indexers to maven
- #96: Port pn-ddbdp-indexers project to maven
- #98: Document#parseName may be ambiguous/incorrect
- #102: Number server indexing source fields
- #103: Number server x is related to y functionality
- #104: Publication Series Drop Down Menu Fix
- #105: Links between Static and PN views
- #107: XML filenames with a '+' literal are being incorrectly decoded
- #108: PN hierarchal browsing
- #109: PN linear browsing
- #110: PN Tab overhaul
- #112: PN Tabs --> Collapsible portlet view???
- #113: PN user profiles
- #114: Tab Reform
- #115: DDBDP Search tab reform
- #116: Numbers Search tab reform
- #117: Combined and Rationalized Search tabs
- #118: PN Tab overhaul: News & Updates
- #119: Add Linkage to digitized editions elsewhere on web
- #120: Link out to Trismegistos from PN
- #123: PN to SoSOL linking
- #125: clean, stable URLs for PN content
- #126: GAWDly Atom feeds for PN content
- #127: address assessment of other projects
- #129: Window in PN for communications
- #132: Port pn-numbers to maven
- #135: Refactor tests for pn-ddbdp-indexers
- #140: Columbia PN: make DEV public
- #141: port pn-jndi to maven
- #142: Port pn-ddbdp-portlets project to maven
- #143: Port pn-metadata-portlets to maven
- #144: Refactor transcoder for maven
- #152: Port jetspeed-navigator to maven
- #162: Set up data locations for PN
- #164: Debug PN
- #187: Sort out deployment method
- #188: Greek search does not return highlighted results
- #189: DDBDP info appears in HGV Translation portlet
- #190: Unicode fraction display
- #191: Lemmatized Searching not functioning in NYU PN
- #192: NYU PN: substring searching weirdness
- #193: NYU PN vs Col PN different number of returns on same DDBDP search
- #194: NYU PN HTML display of DDBDP texts
- #195: NYU PN vs Col PN different number of returns on same APIS search
- #196: NYU PN vs Col PN different number of returns on same Numbers search
- #197: Unicode unclear fraction display
- #213: DDbDP search page off by one
- #221: Create second tomcat instance for PN
- #227: Incorrect linkage
- #228: Incorrect Display of Search Results
- #229: Proximity Searching Errors
- #230: The limiters AND and NOT apparently not working
- #231: NYU PN: Search link on main page
- #234: Duplicate HGV metadata
- #236: "Clear Form" on DDbDP Search Page Disables "input in beta code"
- #249: PN HGV metadata
- #251: Leiden missing from Greek in "initial results"
- #252: Metadata appearing in Greek text box on "initial results" page
- #254: Incomplete Search Results (choice vs app)
- #255: In IE Greek dropping out of PN
- #266: PN user experience / information architecture: data layout
- #267: PN user experience / information architecture: search forms
- #268: PN user experience / information architecture: Search Results
- #269: PN user experience / information architecture: 'Revise Search'
- #270: PN user experience / information architecture: basic / advanced search forms
- #283: PN HGV metadata "Publication" field
- #289: Query Parser for search strings
- #299: Check into impact of <lb type="worddiv">
- #306: Add Latin lemmatized searching
- #310: encoding error in Apis description (Qur'an)
- #311: Normalize fonts for site
- #312: Some Latin texts dropping out
- #313: "Clear Form" not working in DDBDP Search
- #314: Text of P.Wisc, II 59 missing from PN
- #319: Modify XSLTs to generate new PN view pages
- #321: Deploy new static views in place of PN views
- #322: Unify metadata and full text search indices
- #323: Unify search result views
- #379: add "I'm interested" mechanism to SoSOL/PN for APIS records
- #380: image viewer features
- #381: incorporate BYU multispectral image viewing (manipulation?) into PN
- #383: Greek Wildcard searching in PN not working
- #389: New Image Server
- #390: Reorganize PN XSLT to be separate from EpiDoc XSLT
- #396: Concordia Testing
- #428: Integrate APIS EpiDoc records into PN indexing workflow
- #473: Surface old collection.xml equivalencies for posterity