XForms for Archives<br />
Methodologies for applying XForms and linked data to archival resources<br />
Ethan Gruber (http://www.blogger.com/profile/14492799646719449654)<br />
<br />
First pass mapping EAC-CPF to Linked Art JSON-LD (posted 2020-09-01)<br />
<p>After pushing updates to map people and organization concepts in <a href="http://nomisma.org">Nomisma.org</a> to Linked Art-compliant JSON, I have implemented a similar serialization in <a href="https://github.com/ewg118/xEAC">xEAC</a>, the open source authority management framework, based on EAC-CPF, that I have been developing on and off since 2012.</p><p>As in the Nomisma and <a href="https://github.com/ewg118/numishare/">Numishare</a> projects, the response to an HTTP request for an authority URI includes Link headers that point to alternate RDF, Turtle, and JSON-LD serializations of that resource, including a JSON-LD serialization following the <a href="https://linked.art/ns/v1/linked-art.json">Linked Art profile</a>. A capable JSON-LD parser can convert this profile into other serializations of RDF (XML, Turtle, etc.) that conform to the CIDOC-CRM ontology, which other semantic web developers might more readily recognize.<br /></p><p>It is therefore possible to request the Linked Art JSON-LD via content negotiation from xEAC, for example:</p>
<pre><code>curl -H "Accept: application/ld+json;profile=\"https://linked.art/ns/v1/linked-art.json\"" \
  http://numismatics.org/authority/adams_edgar</code></pre>
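<p>To make the negotiation step concrete, here is a minimal Python sketch of how a server might interpret the Accept header sent by the request above. xEAC itself does this in an XProc pipeline; the function names and the returned labels below are hypothetical, and q-values are deliberately ignored.</p>

```python
# Hypothetical sketch: xEAC performs this step in XProc, not Python.
LINKED_ART_PROFILE = "https://linked.art/ns/v1/linked-art.json"


def split_outside_quotes(text, sep):
    """Split text on sep, ignoring separators inside double quotes."""
    parts, current, quoted = [], [], False
    for ch in text:
        if ch == '"':
            quoted = not quoted
        if ch == sep and not quoted:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_accept(header):
    """Return a list of (media_type, params) pairs from an Accept value."""
    choices = []
    for part in split_outside_quotes(header, ","):
        pieces = split_outside_quotes(part, ";")
        media_type = pieces[0].strip().lower()
        params = {}
        for piece in pieces[1:]:
            name, _, value = piece.partition("=")
            params[name.strip().lower()] = value.strip().strip('"')
        choices.append((media_type, params))
    return choices


def choose_serialization(header):
    """Pick a serialization label; the labels are illustrative only."""
    for media_type, params in parse_accept(header):
        if media_type == "application/ld+json":
            if params.get("profile") == LINKED_ART_PROFILE:
                return "linked-art-json-ld"
            return "json-ld"
        if media_type == "text/turtle":
            return "turtle"
        if media_type == "application/rdf+xml":
            return "rdf-xml"
    return "html"  # fall back to the human-readable page
```

<p>The profile parameter is what separates the Linked Art output from any other JSON-LD serialization that shares the same media type.</p>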
<br /><p>The Accept header is then parsed by the XProc pipeline in a controller that reads the content-type and profile in order to choose which serialization to enact. In this case, the EAC-CPF document is transformed via an <a href="https://github.com/ewg118/xEAC/blob/master/ui/xslt/serializations/eac/linkedart-json-ld.xsl">XSLT stylesheet</a> into an intermediate XML document that represents a JSON structure of objects and arrays, which is subsequently transformed by a secondary <a href="https://github.com/ewg118/xEAC/blob/master/ui/xslt/serializations/json/json-metamodel.xsl">XSLT stylesheet</a> into a text output, to which the XProc pipeline attaches an `application/ld+json` content-type in the HTTP header. This JSON metamodel approach has been applied throughout many of my frameworks, including Nomisma.org and Numishare, in order to consistently transform various XML schemas into different JSON profiles, from Linked Art to GeoJSON to the model required by d3js for data visualization.</p><p>The mapping of EAC-CPF for people, corporate bodies, and families follows the specifications for people and organizations drafted by the Linked Art community, at <a href="https://linked.art/model/actor/">https://linked.art/model/actor/</a>. A representation of a person in the ANS archival authority system, <a href="http://numismatics.org/authority/adams_edgar">Edgar H. Adams</a>, includes the preferred name and biographical statement (eac:biogHist/eac:abstract), URIs for matching concepts (a xEAC-specific implementation of eac:identity/eac:entityId[@localType = 'skos:exactMatch']), birth/death (for people) and formed_by/dissolved_by (for families and corporate bodies) dates from eac:existDates, and member/member_of links to URIs that implement relevant <a href="https://www.w3.org/TR/vocab-org/">W3C Org ontology</a> properties to the @xlink:arcrole in an eac:cpfRelation (also a specific xEAC implementation to align EAC-CPF more directly with Linked Open Data principles).</p>
<code><pre>{
"@context": "https://linked.art/ns/v1/linked-art.json",
"id": "http://numismatics.org/authority/adams_edgar",
"type": "Person",
"_label": "Adams, Edgar H. (Edgar Holmes), 1868-1940",
"identified_by": [
{
"type": "Name",
"content": "Adams, Edgar H. (Edgar Holmes), 1868-1940",
"classified_as": [
{
"id": "http://vocab.getty.edu/aat/300404670",
"type": "Type",
"_label": "Primary Name"
}
]
}
],
"exact_match": [
"http://viaf.org/viaf/92956241",
"http://d-nb.info/gnd/101883196",
"http://dbpedia.org/resource/Edgar_Adams",
"http://www.wikidata.org/entity/Q3719031",
"http://id.loc.gov/authorities/names/n81061401",
"http://n2t.net/ark:/99166/w6n03w0m"
],
"born": {
"type": "Birth",
"_label": "Start Date",
"timespan": {
"type": "TimeSpan",
"begin_of_the_begin": "1868-04-07",
"end_of_the_end": "1868-04-07"
}
},
"died": {
"type": "Death",
"_label": "End Date",
"timespan": {
"type": "TimeSpan",
"begin_of_the_begin": "1940-05-05",
"end_of_the_end": "1940-05-05"
}
},
"referred_to_by": [
{
"type": "LinguisticObject",
"content": "Edgar H. Adams (1868-1940) of Bayville, Oyster Bay, and Brooklyn, New York, was a numismatic scholar, author, and collector who produced, among other works, reference guides to territorial and private gold coins. He also coauthored, with William H. Woodin, the book United States Pattern, Trial, and Experimental Pieces, a standard reference work on pattern coins. He served as editor of The Numismatist, the monthly journal of the American Numismatic Association, wrote a numismatic column for the New York Sun newspaper, and was a co-founder of the New York Numismatic Club (1908).",
"classified_as": [
{
"type": "Type",
"id": "http://vocab.getty.edu/aat/300435422",
"_label": "Biography Statement",
"classified_as": [
{
"id": "http://vocab.getty.edu/aat/300418049",
"type": "Type",
"_label": "Brief Text"
}
]
}
]
}
],
"member_of": [
{
"type": "Group",
"id": "http://numismatics.org/authority/new_york_numismatic_club",
"_label": "New York Numismatic Club"
},
{
"type": "Group",
"id": "http://viaf.org/viaf/157729460",
"_label": "American Numismatic Association"
}
]
}</pre></code>
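<p>The JSON metamodel idea described above (an intermediate XML document of objects and arrays that a second step flattens into JSON text) can be sketched in Python. The real work is done by two XSLT stylesheets; the element names <code>_object</code> and <code>_array</code> and the helper functions here are assumptions made for illustration, not a description of the actual stylesheets.</p>

```python
import json
import xml.etree.ElementTree as ET

# Assumed container element names for the intermediate XML.
CONTAINERS = ("_object", "_array")

SAMPLE = """<_object>
  <id>http://numismatics.org/authority/adams_edgar</id>
  <type>Person</type>
  <member_of>
    <_array>
      <_object>
        <type>Group</type>
        <_label>New York Numismatic Club</_label>
      </_object>
    </_array>
  </member_of>
</_object>"""


def convert(element):
    """Turn an <_object> or <_array> element into a dict or a list."""
    if element.tag == "_object":
        # Each child element is a property whose name is the JSON key.
        return {prop.tag: property_value(prop) for prop in element}
    if element.tag == "_array":
        # Array items are either nested containers or plain text values.
        return [convert(item) if item.tag in CONTAINERS else (item.text or "")
                for item in element]
    raise ValueError(f"unexpected element: {element.tag}")


def property_value(prop):
    """A property holds either text or exactly one nested container."""
    children = list(prop)
    return convert(children[0]) if children else (prop.text or "")


def metamodel_to_json(xml_text):
    """Serialize the intermediate XML as JSON text."""
    return json.dumps(convert(ET.fromstring(xml_text)), indent=2)


print(metamodel_to_json(SAMPLE))
```

<p>Because the second step knows nothing about EAC-CPF, the same conversion can serve any JSON profile, which is the appeal of the approach.</p>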
<p>Ideally, we would want to be able to include links to geographic resources for places of birth or death, occupations, and other events as machine-readable data, with actionable xs:date values and references to controlled vocabulary URIs. Some of this is already possible within xEAC because it was built from the ground up to interact with LOD resources, but projects like <a href="https://snaccooperative.org/">Social Networks and Archival Context</a> (SNAC) aren't yet well-integrated with external resources.<br /></p>
<br />
270 hoard documents and 60 authorities added to the ANS Archives (posted 2020-03-04)<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
In a major digital archival publication today, 270 documents pertaining to <a href="http://coinhoards.org/">Greek coin hoards</a> have been added into the ANS Digital Archives, <a href="http://numismatics.org/archives/">Archer</a>, and 60 new archival authorities have been added into the <a href="http://numismatics.org/authorities/">ANS Biographies</a> (EAC-CPF records published in <a href="https://github.com/ewg118/xEAC">xEAC</a>). These authorities include numerous prominent numismatists, archaeologists, dealers, and collectors, as well as some individuals who are not prominent--people attested only through our archives and scant provenance records from other museums. Each of these authorities will be created or updated in the <a href="https://snaccooperative.org/">Social Networks and Archival Context</a> (SNAC) project, along with links back to our archival records.<br />
<br />
A nice example is <a href="http://numismatics.org/authority/evans_arthur">Sir Arthur Evans</a>, the famous archaeologist of Knossos. He is mentioned in several letters between Sidney Noe and other scholars. Although Evans does not figure prominently in our own archives, his papers are held in other institutions. Through SNAC, we are able to make our few letters more broadly available to researchers interested in Arthur Evans.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhKhPIUL251dE_hzwA6jaQ6cRwjMaLhDWf-Vie4oyvWn0X4IqDaQrf0rn-Fp4eYnbBs4tjG4RTe4RanEjDVSAgaqRlk0NVryw_PjmQnjCtwFA8xSf_Olhd3yu4toOUHpXpovQE2HLmoiCA/s1600/Screenshot+from+2020-03-04+14-36-59.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="695" data-original-width="1600" height="172" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhKhPIUL251dE_hzwA6jaQ6cRwjMaLhDWf-Vie4oyvWn0X4IqDaQrf0rn-Fp4eYnbBs4tjG4RTe4RanEjDVSAgaqRlk0NVryw_PjmQnjCtwFA8xSf_Olhd3yu4toOUHpXpovQE2HLmoiCA/s400/Screenshot+from+2020-03-04+14-36-59.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The record for Arthur Evans, with links to hoard documents.</td></tr>
</tbody></table>
<br />
<br />
The archival documents themselves represent the first portion of a larger collection of scanned letters, invoices, inventories, notes, hoard photographs, and other research materials related to <i>The Inventory of Greek Coin Hoards</i> and subsequent <i>Coin Hoards</i> volumes. <i>Coin Hoards</i> will be published online in the near future, after we migrate the old IGCH platform into a completely new database system that operates more like <a href="http://numismatics.org/chrr">Coin Hoards of the Roman Republic</a>.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqZJb2EBtKlr-o5UmRSDa-80n5scAzuJagWKmgyGvqArMt0TIyWR9PwbFlMx_ybV2aozAQf_Z63wYu0gku0MinFP0XdAuN9Z3wBU5euZaZXFW63N-5jSP7Y4lKz40Vhyphenhyphenux5PSkDsi9MsY/s1600/Screenshot+from+2020-03-04+14-23-28.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="974" data-original-width="1600" height="242" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqZJb2EBtKlr-o5UmRSDa-80n5scAzuJagWKmgyGvqArMt0TIyWR9PwbFlMx_ybV2aozAQf_Z63wYu0gku0MinFP0XdAuN9Z3wBU5euZaZXFW63N-5jSP7Y4lKz40Vhyphenhyphenux5PSkDsi9MsY/s400/Screenshot+from+2020-03-04+14-23-28.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The display of <a href="http://coinhoards.org/id/igch0140">IGCH 140</a>, with new archival documents</td></tr>
</tbody></table>
<br />
Under the hood, these archival records are TEI documents generated from spreadsheet metadata entered by Peter van Alfen. The images are IIIF-compliant and follow the procedures we have already established with <a href="http://eaditor.blogspot.com/2019/01/updates-to-iiif-image-annotation-in.html">Edward T. Newell's research notebooks</a>. The Archer framework, <a href="https://github.com/ewg118/eaditor">EADitor</a>, was updated to accommodate other types of archival materials represented as TEI (manuscripts, etc.), and EADitor is capable of serializing these files directly into RDF for Archer's SPARQL endpoint (that drives the interconnectivity between the authority records and archival items, as well as the display of archival items in <a href="http://numismatics.org/search/">MANTIS</a> and IGCH). Additionally, the TEI files, and TEI-encoded annotations, are serialized dynamically into IIIF manifests.<br />
<br />
Because all TEI files use the same annotation system in the back-end of EADitor (Masahide Kanzaki's Image Annotator: <a href="https://www.kanzaki.com/works/2016/pub/image-annotator">https://www.kanzaki.com/works/2016/pub/image-annotator</a>), these new archival documents can be annotated with URIs from <a href="http://nomisma.org/">Nomisma.org</a>, coins in our collection, coin types or monograms in <a href="http://numismatics.org/pella">PELLA</a> or other corpora. As a proof of concept, I annotated the names of Mithradates VI and Lysimachus with their respective Nomisma URIs on the notes of Wayte Raymond about IGCH 973: <a href="http://numismatics.org/archives/ark:/53695/igch973.001">http://numismatics.org/archives/ark:/53695/igch973.001</a>. These annotations, stored natively in TEI surface elements within a facsimile, are serialized into JSON-LD according to the IIIF spec in real time, and displayed at the link above in Mirador. The names are also listed in the index below the Mirador viewer.<br />
<br />
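As an illustration of the serialization described above, the following Python sketch turns one annotated TEI region into an annotation in the style of the IIIF Presentation API 2.x. EADitor does this in its own pipeline; the nesting of the TEI elements, the choice of API version, the canvas URI, and the sample values are all assumptions made for this example.<br />

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Illustrative TEI fragment; the structure and coordinates are assumed.
SAMPLE = """<facsimile xmlns="http://www.tei-c.org/ns/1.0">
  <surface ulx="120" uly="340" lrx="420" lry="400">
    <desc><ref target="http://nomisma.org/id/lysimachus">Lysimachus</ref></desc>
  </surface>
</facsimile>"""


def surface_to_annotation(surface, canvas_uri):
    """Build an annotation dict from one annotated tei:surface."""
    ulx = int(surface.get("ulx"))
    uly = int(surface.get("uly"))
    lrx = int(surface.get("lrx"))
    lry = int(surface.get("lry"))
    ref = surface.find(f"{TEI}desc/{TEI}ref")
    target = ref.get("target")
    label = ref.text
    return {
        "@type": "oa:Annotation",
        "motivation": "oa:commenting",
        "resource": {
            "@type": "dctypes:Text",
            "format": "text/html",
            "chars": '<a href="' + target + '">' + label + "</a>",
        },
        # The region is addressed as x, y, width, height in pixels.
        "on": f"{canvas_uri}#xywh={ulx},{uly},{lrx - ulx},{lry - uly}",
    }


facsimile = ET.fromstring(SAMPLE)
annotation = surface_to_annotation(
    facsimile.find(f"{TEI}surface"),
    "http://example.org/canvas/1",  # hypothetical canvas URI
)
print(annotation["on"])
```

Keeping the TEI as the stored form and generating the JSON-LD on request means the annotations never have to be maintained in two places.<br />
<br />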
While we still have more metadata to enter for more archival documents, the data-entry workflow and processing scripts are fully established at this stage. This is the next step in transforming the IGCH database into a more comprehensive research platform for Greek coin hoards.</div>
<br />
135 ANS authority records merged into SNAC (posted 2019-07-09)<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
After building the <a href="http://eaditor.blogspot.com/2018/07/creating-and-updating-snac.html">xEAC-to-SNAC publication workflow</a> into xEAC last summer and fine-tuning it over the last few months, I have finally switched over to the SNAC production API. We have integrated authority data from 135 EAC-CPF records in the <a href="http://numismatics.org/authorities/">American Numismatic Society Biographies</a> into the <a href="https://snaccooperative.org/">Social Networks and Archival Context</a> project. Among these authority records are dozens of new ones inserted into SNAC, complete with biographical information and references to <a href="http://numismatics.org/archives/">digital archival</a> and <a href="http://numismatics.org/digitallibrary/">library</a> holdings at the ANS. One of the more notable additions to SNAC is <a href="http://n2t.net/ark:/99166/w6hj7856">Margaret Thompson</a>, one of the most prominent Greek numismatists of the later 20th century and a long-time curator at the ANS.<br />
<br />
Not only have we provided a comprehensive biography of Margaret Thompson, but also URIs in other systems, such as VIAF and Wikidata. The Bibliographic Resources for Thompson include numerous archival photographs (which link back to the ANS Archives--many of these are available in IIIF) and four ebooks in our Open Access Digital Library. These ebooks were digitized as part of the NEH-Mellon Foundation Open Humanities Book program.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEit_uOK93EwpcCiJSw405K2JsUzNCBWJquJGhIjla3m7hsCSuVflhrVMu-BRC16NbdWKUSsW_gSEdnKMKjvckGu8oFwGUYOUU87q1vnMqPfGQvyn1OJ3oeY19_8VL5yUMeexH-qYPRFHYA/s1600/Screenshot+from+2019-07-09+11-06-58.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="546" data-original-width="1151" height="188" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEit_uOK93EwpcCiJSw405K2JsUzNCBWJquJGhIjla3m7hsCSuVflhrVMu-BRC16NbdWKUSsW_gSEdnKMKjvckGu8oFwGUYOUU87q1vnMqPfGQvyn1OJ3oeY19_8VL5yUMeexH-qYPRFHYA/s400/Screenshot+from+2019-07-09+11-06-58.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">SNAC record for <a href="http://n2t.net/ark:/99166/w6x63rt8">Edward T. Newell</a>, with biography from the ANS.</td></tr>
</tbody></table>
<br />
In fact, since many of the ~200 books digitized as part of this NEH-Mellon project were authored by prominent numismatists represented in the ANS archival authorities, 74 of these books have been made accessible to scholars through SNAC. This was the aim of our initial application to this grant program--finally realized by much work in extending xEAC to be able to interact with SNAC's JSON APIs. We wanted not only to create a large corpus of TEI ebooks that linked to URIs in our numismatic collection or research databases like <a href="http://numismatics.org/ocre">Online Coins of the Roman Empire</a> and the <a href="http://coinhoards.org/">Inventory of Greek Coin Hoards</a> (and similar systems), but also to integrate these books into the larger cloud of cultural heritage data by linking the authors to large-scale authority systems like SNAC that could be leveraged to point researchers back to our own services.<br />
<br />
SNAC was funded not only by Mellon (like our ebooks project), but also initially by the IMLS and the NEH. In this way, we are providing value to funders by building upon projects in which they have already invested: creating a whole that is greater than the sum of its parts. I hope that other institutions will look at <a href="https://github.com/ewg118/xEAC">xEAC</a> and our broader archival LOD strategy (see <a href="https://doi.org/10.5281/zenodo.1484529">Linked Open Data and Hellenistic Numismatics</a> and <a href="https://doi.org/10.5281/zenodo.1304270">Linked Open Data for Numismatic Library, Archive, and Museum Integration</a> for further information about this architecture) as a means by which they too can enhance SNAC while simultaneously broadening access to their own materials.<br />
<br />
By incorporating our archival authorities and digital archives and library into SNAC, we are providing pathways through broader, more generalized aggregators for non-numismatic researchers who may otherwise never think to query our archives directly. A great example of this is the record for the prominent sculptor, <a href="http://n2t.net/ark:/99166/w6m907r8">Augustus Saint-Gaudens</a>. This record links to more than 160 finding aids published by dozens of institutions, including museum archives, and so art historians may find correspondence in our archives as well as in the Smithsonian Archives of American Art or the New York Public Library. Furthermore, since we have already used the Wikidata API look-up inherent to xEAC to embed related authority URIs in our own EAC-CPF record, we inserted the Getty ULAN URI for Saint-Gaudens into SNAC. This would, in theory, make it possible for SNAC to interact with art historical aggregators built on the Getty vocabularies to extract other works of cultural heritage, such as medals held at the American Numismatic Society or sculptures held in other art museums both in the United States and abroad.<br />
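<br />
The step of turning external identifiers found through Wikidata into authority URIs can be sketched as follows. This is not xEAC's own code: the function name and the input format (a plain mapping from Wikidata property to identifier) are hypothetical, and the sample identifiers are the ones published for Edgar H. Adams in the ANS authority record rather than values for Saint-Gaudens.<br />

```python
# Wikidata properties for external identifiers and the URI pattern
# each one expands to.
URI_PATTERNS = {
    "P214": "http://viaf.org/viaf/{}",                 # VIAF ID
    "P227": "http://d-nb.info/gnd/{}",                 # GND ID
    "P244": "http://id.loc.gov/authorities/names/{}",  # LoC name authority
    "P245": "http://vocab.getty.edu/ulan/{}",          # Getty ULAN ID
}


def authority_uris(identifiers):
    """Expand {property: identifier} pairs into a sorted list of URIs.

    Properties without a known pattern are skipped rather than guessed.
    """
    uris = []
    for prop, value in identifiers.items():
        pattern = URI_PATTERNS.get(prop)
        if pattern is not None:
            uris.append(pattern.format(value))
    return sorted(uris)


# Identifiers as they might come back from a Wikidata look-up.
adams = {"P214": "92956241", "P244": "n81061401", "P227": "101883196"}
for uri in authority_uris(adams):
    print(uri)
```

Each resulting URI is a candidate for a skos:exactMatch entity identifier in the EAC-CPF record, and from there for insertion into SNAC.<br />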
<br />
I think we are only seeing the tip of the iceberg of what will be possible through interaction with SNAC.</div>
<br />
Updates to IIIF image annotation in the EADitor back-end (posted 2019-01-10)<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
The American Numismatic Society's archival images were migrated into IIIF in the fall of 2017, including the extension of <a href="https://github.com/ewg118/eaditor">EADitor</a> to facilitate the creation of manifests from TEI files that represent the <a href="http://eaditor.blogspot.com/2017/10/newell-notebooks-migrated-to-iiif.html">Newell notebooks</a>. While the front end was updated to use Leaflet for single photographs (MODS records) or Mirador for image collections, like the <a href="http://numismatics.org/archives/ark:/53695/nnan187715">notebooks</a> or the <a href="http://numismatics.org/mirador/?manifest=http://numismatics.org/archives/manifest/nnan0037">Agnes Baldwin Brett papers</a>, the back-end had not been updated to enable the editing or creation of new annotations.<br />
<br />
After the back-to-back releases of the full <a href="http://numismatics.org/sco/">Seleucid Coins Online</a> and the first phase of <a href="http://numismatics.org/pco/">Ptolemaic Coins Online</a> in December, I have been able to pivot completely from coin type corpora and data cleaning to working on our digital archives for a brief period. After fixing some bugs, I turned my attention to piecing the image annotation back together in the XForms engine for TEI editing/publication within <a href="http://numismatics.org/archives/">Archer</a>. The original system was developed in 2014. <a href="http://eaditor.blogspot.com/2014/06/first-newell-notebook-published-in.html">This</a> blog post covers most of the technical underpinnings, but to summarize: Rainer Simon's <a href="https://annotorious.github.io/">Annotorious</a> was hooked into OpenLayers to facilitate image annotation. The create/remove/update handlers in Annotorious were used to round trip the annotations to/from TEI surface elements within tei:facsimiles and Annotorious' JSON model in the XForms engine (using the client-side <a href="https://doc.orbeon.com/xforms/core/client-side-javascript-api">Javascript hooks</a> in Orbeon). There have been significant updates to Orbeon since 2014, and my original code was somewhat broken, and therefore I needed to explore alternative solutions.<br />
<br />
My first attempt was loading a manifest for a Newell notebook into Mirador in the XForms web form. Although Mirador did load the manifest, the annotation popups (with the TinyMCE library) didn't function correctly due to some unforeseen conflicts between the Javascript in Orbeon and Mirador. I then began to explore Masahide Kanzaki's <a href="https://www.kanzaki.com/works/2016/pub/image-annotator">Image Annotator</a>. This was appealing, as I had tested this application's ability to show two images on the same canvas in dynamically SPARQL-generated IIIF manifests from <a href="https://github.com/ewg118/numishare">Numishare</a>-based type corpora (see this example of <a href="http://www.kanzaki.com/works/2016/pub/image-annotator?u=http%3A%2F%2Fnumismatics.org%2Fcrro%2Fmanifest%2Frrc-15.1a">RRC 15/1a</a> that combines IIIF images from three different museums into one manifest--one canvas per coin and two images per canvas). The Image Annotator not only loads IIIF manifests into OpenSeadragon, but was extended to support Annotorious for creating and viewing annotations.<br />
<br />
After several days of work, I have been able to fully reactivate image annotation in the EADitor back-end with the Image Annotator. It took a little bit of reverse engineering in order to find the functions for the handlers, with some slight modifications to my original code to hook the Annotorious handlers into the XForms engine. This included some changes in the mathematical calculations for converting the ratio-based coordinates to pixels for the TEI surface's upper-left x,y and lower-right x,y attributes. These TEI attributes are serialized into proper #xywh fragments in the Web Annotations in the manifest.<br />
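<br />
The coordinate arithmetic can be sketched in a few lines of Python. The sketch assumes the annotation geometry arrives as x, y, width, and height expressed as fractions of the image's own width and height; the real handler's input may be normalized differently, and the function names are hypothetical.<br />

```python
def ratio_box_to_tei(geometry, image_width, image_height):
    """Convert a fractional box into pixel corner attributes.

    geometry holds x, y, width, height as fractions (0 to 1) of the
    image dimensions, which is an assumption of this sketch.
    """
    return {
        "ulx": round(geometry["x"] * image_width),
        "uly": round(geometry["y"] * image_height),
        "lrx": round((geometry["x"] + geometry["width"]) * image_width),
        "lry": round((geometry["y"] + geometry["height"]) * image_height),
    }


def tei_box_to_xywh(box):
    """Express corner coordinates as an #xywh media fragment."""
    width = box["lrx"] - box["ulx"]
    height = box["lry"] - box["uly"]
    return f"#xywh={box['ulx']},{box['uly']},{width},{height}"


# A box covering one eighth of the width and one quarter of the height
# of a 2000 x 1000 pixel image.
box = ratio_box_to_tei(
    {"x": 0.25, "y": 0.5, "width": 0.125, "height": 0.25}, 2000, 1000
)
print(box)
print(tei_box_to_xywh(box))
```

Storing the corners in TEI and deriving the width and height only at serialization time keeps the TEI independent of any one viewer's coordinate model.<br />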
<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOdlMvvak7vDjA-QbPMG-FueD36EXRyGFiMRn7omZ3iqVXx2yof0KvP2bafjt1lcCp1gk4HPNqO-UcOMKwM7ip6NPmr0BuI19lhWWL6CZyV56aFfZ97MEpDIO_4ZMhgIhIaAOHI5B1zRQ/s1600/Screenshot+from+2019-01-10+11-30-08.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="984" data-original-width="1600" height="245" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOdlMvvak7vDjA-QbPMG-FueD36EXRyGFiMRn7omZ3iqVXx2yof0KvP2bafjt1lcCp1gk4HPNqO-UcOMKwM7ip6NPmr0BuI19lhWWL6CZyV56aFfZ97MEpDIO_4ZMhgIhIaAOHI5B1zRQ/s400/Screenshot+from+2019-01-10+11-30-08.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig. 1: Image Annotator in the XForms engine</td></tr>
</tbody></table>
<br />
I also had to track down and comment out some components of the UI (like the document metadata and links) and tweak the CSS so that the OpenSeadragon window fit within the parameters of my existing Bootstrap 3.x template.<br />
<br />
URIs in certain namespaces are still parsed to extract human-readable labels (see Fig. 1 and 2), for example, from the <a href="http://numismatics.org/search">ANS collection</a>. My intention is to extend the range of parseable URIs to include Wikidata, other URIs in the ANS digital library or archives, <a href="http://snaccooperative.org/">Social Networks and Archival Context</a>, <a href="https://www.oclc.org/developer/develop/linked-data/worldcat-entities/worldcat-work-entity.en.html">Worldcat Works</a>, and, eventually, URIs for Hellenistic monograms. I might even extend the parsing to extract thumbnail images for coins and store those in the tei:desc within the TEI document (in addition to simple mixed content w/ tei:ref elements as external links).<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6FJUWOzl9-29MrUqhqAp37xFtKcrG_8PiisKHYHaVu18e_w_fi5Ob6t-6047BUc-h3lIUIH9AdA3-dyy4QVKSAd8E2rRnD_a86d388adny9J7xWSlnX4jr7YHFaa82JHDi0164dtIWGg/s1600/Screenshot+from+2019-01-10+11-30-24.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="366" data-original-width="505" height="288" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg6FJUWOzl9-29MrUqhqAp37xFtKcrG_8PiisKHYHaVu18e_w_fi5Ob6t-6047BUc-h3lIUIH9AdA3-dyy4QVKSAd8E2rRnD_a86d388adny9J7xWSlnX4jr7YHFaa82JHDi0164dtIWGg/s400/Screenshot+from+2019-01-10+11-30-24.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig. 2: After clicking 'Save', the URI is replaced with an HTML link</td></tr>
</tbody></table>
<br />
After the reworking of the IGCH data over the next several months, we will turn our attention to annotating more of Edward T. Newell's notebooks as part of the NEH-funded <a href="http://numismatics.org/neh-hrc2017/">Hellenistic Royal Coinages</a> (HRC) project. The UI provided by the Image Annotator is much easier to work with than the one I had developed more directly within XForms nearly five years ago, and so we should see some significant progress toward annotating these notebooks to link to coins in our (or other) numismatic collections, coin types in HRC, Greek coin hoards, and our yet-to-be-published database of Greek monograms. These annotations will also enhance research context in our other platforms by pointing users back to individual notebook pages in Archer from MANTIS or IGCH (for example, from <a href="http://coinhoards.org/id/igch1664">http://coinhoards.org/id/igch1664</a> or <a href="http://numismatics.org/collection/1944.100.26870">http://numismatics.org/collection/1944.100.26870</a>).<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjz0JCSU7T85MMkZ58e0o01NobCAq_NNvEwhOP_AHkHsdb6wZkqITNGmudrUSFrVD4xs9jJ6AEHQW6HjbqvHIm7FvNSb_cTuWngt-P_CeobZxedsb6JZTU6n8wfE2IThU_tlobuZ69_tQ0/s1600/Screenshot+from+2019-01-10+11-48-31.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1187" data-original-width="1600" height="296" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjz0JCSU7T85MMkZ58e0o01NobCAq_NNvEwhOP_AHkHsdb6wZkqITNGmudrUSFrVD4xs9jJ6AEHQW6HjbqvHIm7FvNSb_cTuWngt-P_CeobZxedsb6JZTU6n8wfE2IThU_tlobuZ69_tQ0/s400/Screenshot+from+2019-01-10+11-48-31.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><a href="http://eaditor.blogspot.com/2016/03/toward-more-thoroughly-integrated.html">SPARQL-generated</a> list of Open Annotations related to IGCH 1664</td></tr>
</tbody></table>
<br /></div>
<br />
An American Europeana (posted 2018-11-15)<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
This blog is usually reserved for updates or technical explanations of archival/authority software development at the <a href="http://numismatics.org/">American Numismatic Society</a>, or experimentation in new modes of archival data publication (mainly Linked Open Data).<br />
<br />
However, since I have long been a proponent of open, community-oriented efforts to publish cultural heritage aggregations, like <a href="https://www.europeana.eu/portal/en">Europeana</a> and <a href="https://dp.la/">DPLA</a>, I wanted to take a bit of time to hash out some thoughts in the form of a blog post instead of starting a series of disjointed Twitter threads [<a href="https://twitter.com/ewg118/status/1061443389848281089">1</a>, <a href="https://twitter.com/ewg118/status/1062822107556597761">2</a>].<br />
<br />
Most of you have likely heard that DPLA laid off six employees, and John S. Bracken went <a href="https://twitter.com/ALA_LITA/status/1061358390293733376">online</a> to speak of his vision and answer some questions. This vision seems to revolve around ebook deals primarily, with cultural heritage aggregation as a secondary function of DPLA. However, DPLA laid off the people that actually know how to do that stuff, so the aggregation aspect of the organization (which is its real and lasting value to the American people) no longer seems viable.<br />
<br />
I believe the ultimate solution for an American version of Europeana is tying it into the institutional function of a federally-funded organization like the Library of Congress or Smithsonian, with the backing of Congressional support for the benefit of the American people (which is years away, at least). However, I do think there are some shorter-term solutions that can be undertaken to bootstrap an aggregation system administered by one organization or a small body of institutions working collaboratively. There doesn't need to be a non-profit organization in the middle to manage this system, at least at this phase.<br />
<br />
There are a few things to point out regarding the system's political and technical organization:<br />
<br />
<ol style="text-align: left;">
<li>The real heavy lifting is done by the service/content hubs. It takes more time/money/professional expertise to harvest and normalize the data than it does to build the UI on top of good quality data.</li>
<li>Much of the aggregation software has been written already, but hasn't been shared broadly with the community.</li>
<li>There seems to be a wide variation in the granularity and quality of data provided to DPLA. I wrote a <a href="https://github.com/Orbis-Cascade-Alliance/harvester">harvester</a> for <a href="https://www.orbiscascade.org/alliance-harvester/">Orbis Cascade</a> that provided them with DPLA Metadata Application Profile-compliant RDF that had some normalization of strings extracted from Dublin Core to Getty AAT and VIAF URIs, which were modeled properly into SKOS Concepts or EDM Agents. But DPLA couldn't actually ingest their own data model.</li>
<li>Europeana has already written a ton of tools that can be repurposed. </li>
<li>There are other off-the-shelf tools that scale and could be appropriated for either the UI or the underlying architecture (Blacklight, various open source triplestores, like Apache Fuseki, which I have heard will scale at least to a billion triples).</li>
<li>On a non-technical level, the name "Digital Public Library of America" itself is problematic, because the project has been overwhelmingly driven by R1 research libraries. Cultural Heritage is more than what you find in a Special Collections Library, and museums are notably absent from this picture (in contrast to Europeana).</li>
</ol>
<br />
Without knowing more of the details, I had heard that DPLA had scaling issues with their SPARQL endpoint software. I don't know if this is still an issue with this particular software, but I do believe the data were a problem. Aside from what was produced by those organizations that are part of Orbis Cascade that opted to reconcile their strings to things (sadly, most did not choose to take this additional step), how much data ingested by DPLA is actual, honest to God Linked Open Data--with, you know, links? A giant triplestore that's nothing but literals is not very useful, and it's impossible to build UIs for the public that can live up to the potential of the data and the architectural principles of LOD.<br />
<br />
At some point, there needs to be a minimum data quality barrier to entry into DPLA, and part of this is implementing a required layer of reconciliation of entities to authoritative URIs. I understand this does create more work for individual organizations that wish to participate, but the payoffs are immense:<br />
<br />
<ol style="text-align: left;">
<li>Reconciliation is a two way street: it enables you to extract data from external sources to enhance your own public-facing user interface (biographies about people--that sort of thing).</li>
<li><a href="http://snaccooperative.org/">Social Networks and Archival Context</a> should play a vital role in the reconciliation of people, families, and corporate bodies. There should be greater emphasis in the LibTech community to interoperate with SNAC in order to create entities that only exist in local authority files, which will then enable all CPF entities to be normalized to SNAC URIs upon DPLA ingestion.</li>
<ul>
<li>Furthermore, SNAC itself can interact with DPLA APIs in order to populate a more complete listing of cultural heritage objects related to that entity. Therefore, there is an immediate benefit to contributors to DPLA, as their content will simultaneously become available in SNAC to a wide range of researchers and genealogists via LOD methodologies.</li>
<li>SNAC is beginning to aggregate content about entities, so it frankly doesn't make sense for there to be two architecturally dissimilar systems that have the same function. DPLA and SNAC should be brought closer together. They <i>need </i>each other in order for both projects to maximize their potential. I <b>strongly</b> believe these projects are inseparable.</li>
</ul>
<li>With regard to the first two points, content hubs should put greater emphasis on building reconciliation services for non-technical librarians, archivists, curators, etc. to use, with intuitive user interfaces that allow for efficient clean-up. Many people (including myself) have already built systems that look up entities in Geonames, VIAF, SNAC, the Getty AAT/ULAN, Wikidata, etc. This work doesn't need to be done from scratch.</li>
</ol>
Because DPLA's data are so simple and unrefined, much of the lowest-hanging fruit in digital collection interfaces has not been achieved, such as basic geographic visualization. Furthermore, facet fields are basically useless because there's no controlled vocabulary.<br />
<br />
After expanding the location facet for a basic text search of Austin, I am seeing lists that appear to be Library of Congress-formatted geographic subject headings. The most common heading is "United States - Texas - Travis County - Austin", mainly from the <span class="ListView__itemProvider___2BlBb">Austin History Center, Austin Public Library. However, there are many more variations of the place name contributed by other organizations.</span><br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhBvlQyAQ7be_sxSPT7hORFAF4naJDJMz9Hygz408me67_xUiYNiDT_5FfDZ98GKFJiuNxwy-_bSDaYupID7c1Banv7XW1ObtZdBS-cXPbGr8DXnJxKaR36hnBo1am6L-BAWq5ItXeTLyQ/s1600/Screenshot+from+2018-11-15+12-15-22.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="347" data-original-width="426" height="260" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhBvlQyAQ7be_sxSPT7hORFAF4naJDJMz9Hygz408me67_xUiYNiDT_5FfDZ98GKFJiuNxwy-_bSDaYupID7c1Banv7XW1ObtZdBS-cXPbGr8DXnJxKaR36hnBo1am6L-BAWq5ItXeTLyQ/s320/Screenshot+from+2018-11-15+12-15-22.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The many Austins</td></tr>
</tbody></table>
<span class="ListView__itemProvider___2BlBb"></span><br />
<span class="ListView__itemProvider___2BlBb">This is really a problem that needs to be addressed further down the chain from DPLA at the hub level. If you want to build a national aggregation system that reaches its full potential, more emphasis needs to be placed on data normalization. </span><br />
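As a rough illustration of the kind of hub-level normalization I mean, the following Python sketch collapses several spellings of a hierarchical place heading into one key and looks that key up in a small reconciliation table. The separator rules, the sample variants, and the example.org URI are assumptions made up for illustration; they are not DPLA's or any hub's actual data or workflow.<br />

```python
import re

# Hypothetical reconciliation table: normalized heading -> authoritative URI.
# The URI below is a placeholder, not a real gazetteer identifier.
PLACE_URIS = {
    "united states|texas|travis county|austin": "http://example.org/place/austin-tx",
}

# Separators assumed here: " - " (hyphen with spaces), "--", en dash, em dash.
# A bare hyphen is left alone so names like "Winston-Salem" survive intact.
SEPARATOR = re.compile(r"\s+-\s+|\s*--\s*|\s*[\u2013\u2014]\s*")


def normalize_heading(heading):
    """Collapse separator and case variations in a hierarchical place heading."""
    parts = SEPARATOR.split(heading.strip())
    return "|".join(p.strip().lower() for p in parts if p.strip())


def reconcile(heading):
    """Return the URI for a heading, or None if it still needs manual review."""
    return PLACE_URIS.get(normalize_heading(heading))


variants = [
    "United States - Texas - Travis County - Austin",
    "United States--Texas--Travis County--Austin",
    "united states \u2013 texas \u2013 travis county \u2013 austin",
    "Austin (Tex.)",
]
for variant in variants:
    print(variant, "->", reconcile(variant))
```

The last variant deliberately falls through to None: string clean-up alone cannot match a differently structured heading, which is where a human-reviewed reconciliation interface comes in.<br />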
<span class="ListView__itemProvider___2BlBb"><br /></span>
<span class="ListView__itemProvider___2BlBb">DPLA decided to go large scale, low quality. I am much more of a small scale, good quality person, because it is easier to scale up later once you have the workflows to produce good quality data than it is to go back and clean up a pile of poor data.</span><span class="ListView__itemProvider___2BlBb"> And I don't think that the
current form of the DPLA interface is powerful enough to demonstrate the
value of entity reconciliation to the librarians, curators, etc. making
the most substantial investment of time. You can't get the buy-in from that specialist community without demonstrating a powerful user interface that capitalizes on the effort they have made. I know this from experience. <a href="http://nomisma.org/">Nomisma.org</a> struggled to get buy-in until we built <a href="http://numismatics.org/ocre">Online Coins of the Roman Empire</a>, and now Nomisma is considered one of the most successful LOD projects out there.</span><br />
<span class="ListView__itemProvider___2BlBb"><br /></span>
<span class="ListView__itemProvider___2BlBb">My recommendation is to go back to the drawing board with a small number of data contributors to develop the workflows that are necessary to build a better aggregation system. This process should be completely transparent and can be replicated within the other content hubs. The burden of cleaning data shouldn't fall on the shoulders of DPLA (or whoever comes next).</span><br />
<span class="ListView__itemProvider___2BlBb"><br /></span>
<span class="ListView__itemProvider___2BlBb">There are obvious funding issues here, but contributions of staff time and expertise can be more valuable than monetary contributions in this case.</span></div>
Ethan Gruberhttp://www.blogger.com/profile/14492799646719449654noreply@blogger.com0tag:blogger.com,1999:blog-3664503123885891673.post-71426007441911385812018-07-11T07:23:00.000-07:002018-07-11T07:27:52.426-07:00Creating and Updating SNAC constellations directly in xEAC<div dir="ltr" style="text-align: left;" trbidi="on">
After 2-3 weeks of work, I have made some very significant updates to <a href="https://github.com/ewg118/xEAC">xEAC</a>, which pave the way to making archival materials at the <a href="http://numismatics.org/archives/">American Numismatic Society</a> (and other potential users of our open source software frameworks) broadly accessible to other researchers. This is especially important for us, since we are a small archive with unique materials that don't reach a general historical audience, and we are now able to fulfill one of the potentialities we outlined in our Mellon-NEH Open Humanities Book project: that we would be able to make <a href="http://numismatics.org/digitallibrary/results?q=genre_facet:%22e-books%22">200+ open ebooks</a> available through <a href="http://snaccooperative.org/">Social Networks and Archival Context</a> (SNAC).<br />
<br />
I have introduced a new feature that interacts with the SNAC JSON API within the XForms backend of xEAC (note that you need to use an XForms 2.0 compliant processor for xEAC in order to make use of JSON data). The feature will create a new constellation if none exists or supplement existing constellations with data from the local EAC-CPF record. While the full range of EAC-CPF components is supported by the SNAC API, I have focused primarily on the integration of the stable URI for the entity in the local authority system (e.g., <a href="http://numismatics.org/authority/newell">http://numismatics.org/authority/newell</a>), existDates (if they are not already in the constellation), and the biogHist. Importantly, if xEAC users have opted to connect to a <a href="http://eaditor.blogspot.com/2014/06/ans-archives-v2-has-gone-live-ead-eac.html">SPARQL endpoint</a> that also contains archival or library materials, these related resources will be created in SNAC and linked to the constellation.<br />
<br />
It should be noted that this system is still in beta and has only been tested with the SNAC development server. There is still work to do with improving the authentication handshake between xEAC and SNAC.<br />
<br />
<h3 style="text-align: left;">
The process</h3>
<h4 style="text-align: left;">
Step 1: Reviewing an existing constellation for content</h4>
<br />
The first step of the process is executed when the user loads the form. If the EAC-CPF record already contains an entityId that conforms to the permanent, stable SNAC ARK URI, a "read" query will be issued to the SNAC API in order to determine what content already exists in the constellation, including what resources are already available in the constellation vs. the resources extracted from the local archival information system via SPARQL.<br />
<br />
The SPARQL query for extracting resources from the endpoint is as follows:<br />
<br />
<pre><code>PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?uri ?role ?title ?type ?genre ?abstract ?extent WHERE {
  ?uri ?role <http://numismatics.org/authority/newell> ;
       dcterms:title ?title ;
       rdf:type ?type ;
       dcterms:type ?genre .
  OPTIONAL {?uri dcterms:abstract ?abstract}
  OPTIONAL {?uri dcterms:extent ?extent}
} ORDER BY ASC(?role)</code></pre>
<br />
<br />
I recently made an update to our Digital Library and Archival software so that every type of resource (ebooks and notebooks in TEI, photographs in MODS, finding aids in EAD) will include a dcterms:type linking to a <a href="http://vocab.getty.edu/aat/">Getty AAT</a> URI in the RDF serialization. This AAT URI, in conjunction with the rdf:type of the archival or library object (often a schema.org Class), will help determine the type of resource according to SNAC's own parameters (BibliographicResource, ArchivalResource, DigitalArchivalResource). Additionally, the role of the entity with respect to the resource (dcterms:creator, dcterms:subject) informs the role within the SNAC resource-constellation connection: creatorOf, referencedIn. Abstracts and extents are inserted, if available.<br />
<br />
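To make the mapping in the previous paragraph concrete, here is a minimal Python sketch. The role mapping (dcterms:creator to creatorOf, dcterms:subject to referencedIn) comes from the description above. Everything else is an assumption for illustration: the fallback role, the genre rule table, the placeholder AAT URIs, and the rule that upgrades an archival resource to a digital one are hypothetical and are not xEAC's actual decision logic.<br />

```python
DCTERMS = "http://purl.org/dc/terms/"

# Stated in the post: predicate linking the resource to the entity
# -> role of the entity in the SNAC resource-constellation connection.
ROLE_MAP = {
    DCTERMS + "creator": "creatorOf",
    DCTERMS + "subject": "referencedIn",
}


def snac_role(predicate_uri, default="referencedIn"):
    """Translate the ?role binding from the SPARQL results into a SNAC role.

    Falling back to 'referencedIn' for other predicates is an assumption.
    """
    return ROLE_MAP.get(predicate_uri, default)


def snac_resource_type(genre_uri, rdf_type_uri, genre_rules, digital_types=frozenset()):
    """Pick a SNAC resource type from dcterms:type (AAT) and rdf:type.

    genre_rules maps an AAT URI to 'BibliographicResource' or
    'ArchivalResource'. The table and the 'digital' upgrade are hypothetical.
    """
    base = genre_rules.get(genre_uri, "ArchivalResource")
    if base == "ArchivalResource" and rdf_type_uri in digital_types:
        return "DigitalArchivalResource"
    return base


# Placeholder AAT URIs; real AAT identifiers are numeric and are not guessed here.
rules = {
    "http://vocab.getty.edu/aat/PLACEHOLDER-ebook": "BibliographicResource",
    "http://vocab.getty.edu/aat/PLACEHOLDER-photograph": "ArchivalResource",
}
digital = frozenset({"http://schema.org/Photograph"})

print(snac_role(DCTERMS + "creator"))
print(snac_resource_type("http://vocab.getty.edu/aat/PLACEHOLDER-photograph",
                         "http://schema.org/Photograph", rules, digital))
```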
<h4 style="text-align: left;">
Step 2: Validate authentication</h4>
<br />
SNAC uses Google user tokens for validation within its own system. There is currently no handshake between xEAC and SNAC that would allow multiple xEAC users to each have their own SNAC credentials. At the moment, the "user" information is stored in the xEAC config file. A user will have to enter their Google credentials from the SNAC API Key page into the web form and click the "Confirm User Data" button. xEAC will submit an "edit" to a random constellation to verify the validity of the authentication information. If it is successful, the credentials are then stored back into the config (although the token only lasts about 24 hours) and the constellation is immediately unlocked. The user will then proceed to the create/update constellation interface.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSvoubkZj2Nm3z_NXfGgP97JDd4tGWSnXnKEreMSHwLY7EDOUgmG6CJVRnOMeZ-PWNb_KSgidvMIfizY4yuAQOKrhVjEF9_uE_V3dDapyw-0vw0QAh8cJgCpXYtlUNqvBOZATmU-1KOoY/s1600/Screenshot+from+2018-07-11+09-02-30.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="655" data-original-width="1600" height="163" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSvoubkZj2Nm3z_NXfGgP97JDd4tGWSnXnKEreMSHwLY7EDOUgmG6CJVRnOMeZ-PWNb_KSgidvMIfizY4yuAQOKrhVjEF9_uE_V3dDapyw-0vw0QAh8cJgCpXYtlUNqvBOZATmU-1KOoY/s400/Screenshot+from+2018-07-11+09-02-30.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Authenticating through xEAC</td><td class="tr-caption" style="text-align: center;"><br /></td></tr>
</tbody></table>
<h4 style="text-align: left;">
Step 3: Creating or updating a constellation</h4>
<br />
The user will now see several checkboxes to add information into the constellation. Eventually, it will be possible to remove data as well. Below is a synopsis of options:<br />
<br />
<ol style="text-align: left;">
<li>Same As URI: The URI of the entity in the local authority system will be added into the constellation. This is especially important for establishing concordances between different vocabulary systems.</li>
<li>Exist dates can be added into the constellation if they are not already present.</li>
<li>If there isn't already a biogHist in the constellation and there is one present in the EAC-CPF record, the biogHist will be escaped and published to SNAC. A source will also be created in the constellation in order to link the new biogHist to SNAC control metadata, tying the new biogHist directly to the local URI for the authority. This makes it possible to update or delete only the biogHist associated with your own entity without overwriting other biogHist information that might already be present within the constellation. While SNAC does support multiple biogHists, only the most recently added biogHist will appear in the HTML view of the entity. For this reason (at present), xEAC will only insert a biogHist if there isn't one in the constellation already. In step 1, if the constellation already contains a biogHist associated with the source URI for your authority, xEAC will hash the constellation's biogHist and compare it to the hash of the biogHist currently in the EAC-CPF record. If the two hashes differ, the constellation will be updated with the current version of the biogHist in the EAC-CPF record.</li>
<li>A list of resource relations derived from SPARQL will be displayed. All will be checked by default in order to first create the resource with the "insert_resource" API command, and second to connect the constellation to that newly created resource with "update_constellation". Each resource entry will display some basic metadata and whether or not it already exists in the constellation, and what action will be taken. It is possible to uncheck the box for a resource that exists in the constellation to remove it from the constellation.</li>
</ol>
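The biogHist comparison in option 3 can be sketched in a few lines of Python. This is only an illustration under assumptions: the post does not say which hash function xEAC uses, so SHA-256 is a stand-in, and the whitespace normalization and the function names are hypothetical.<br />

```python
import hashlib


def biog_hash(text):
    """Hash a biogHist after collapsing whitespace (normalization is an assumption)."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def biog_action(local_biog, own_snac_biog, other_biog_present):
    """Decide what to do with the local biogHist.

    local_biog:         biogHist text from the EAC-CPF record, or None
    own_snac_biog:      biogHist in the constellation tied to our source URI, or None
    other_biog_present: True if the constellation holds someone else's biogHist
    """
    if not local_biog:
        return "skip"
    if own_snac_biog is not None:
        # Our biogHist is already in SNAC: update only if the text has changed.
        if biog_hash(local_biog) != biog_hash(own_snac_biog):
            return "update"
        return "skip"
    if other_biog_present:
        # Avoid displacing an existing biogHist in the HTML view.
        return "skip"
    return "insert"


print(biog_action("Edward T. Newell was a numismatist.", None, False))
print(biog_action("Edward T. Newell was a numismatist.",
                  "Edward T.  Newell was a numismatist.", False))
print(biog_action("A revised biography.", "An older biography.", False))
```

Comparing hashes rather than full strings keeps the check cheap when the biographical text is long, at the cost of not showing what actually changed.<br />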
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjNv63o7SNNJcKIypziHHSyM8Ht503hZx5EdyPqUZLlxO_GVZNlBXKQLcatuIrntgBU2_rVS2zdBVqgBMz2gBcSYmsRd_H5vZr9xD-lzDCILzutt8jZA2oRLLVDvvBD89Nx_Kn0k4Qv3wc/s1600/Screenshot+from+2018-07-11+09-43-48.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1195" data-original-width="1600" height="298" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjNv63o7SNNJcKIypziHHSyM8Ht503hZx5EdyPqUZLlxO_GVZNlBXKQLcatuIrntgBU2_rVS2zdBVqgBMz2gBcSYmsRd_H5vZr9xD-lzDCILzutt8jZA2oRLLVDvvBD89Nx_Kn0k4Qv3wc/s400/Screenshot+from+2018-07-11+09-43-48.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The interface for creating and updating SNAC constellations</td><td class="tr-caption" style="text-align: center;"><br /></td><td class="tr-caption" style="text-align: center;"><br /></td></tr>
</tbody></table>
<h4 style="text-align: left;">
Step 4: Saving the ARK back to the EAC-CPF record, if applicable</h4>
After the successful issuing of "publish_constellation" to the SNAC API, an entityId with the new SNAC ARK URI will be inserted into the EAC-CPF record, if the constellation is newly created (updates presume the ARK already exists in the EAC record). Saving the EAC record will trigger a re-indexing of the document to Solr and a SPARQL/Update that will insert the ARK as a skos:exactMatch into the concept object for the entity.<br />
<pre><code>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
INSERT { ?concept skos:exactMatch <ARK> }
WHERE { ?concept foaf:focus <URI> }
</code>
</pre>
<br />
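For illustration, the placeholders in the update above could be filled in as in the following Python sketch. xEAC itself does this inside its XForms/XPL pipeline, so the function name and the validation step are assumptions, and the ARK in the example is a placeholder rather than a real SNAC identifier.<br />

```python
import re

# Same statement as above, with the <ARK> and <URI> placeholders parameterized.
UPDATE_TEMPLATE = """PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
INSERT { ?concept skos:exactMatch <%(ark)s> }
WHERE { ?concept foaf:focus <%(uri)s> }"""

# Characters the SPARQL grammar does not allow inside <...> IRI references.
UNSAFE_IRI_CHARS = re.compile(r'[<>"{}|^`\\\x00-\x20]')


def build_update(ark, uri):
    """Return the SPARQL/Update string, refusing values that could break out of <...>."""
    for value in (ark, uri):
        if not value or UNSAFE_IRI_CHARS.search(value):
            raise ValueError("not a safe IRI: %r" % (value,))
    return UPDATE_TEMPLATE % {"ark": ark, "uri": uri}


print(build_update("http://example.org/ark:/00000/placeholder",
                   "http://numismatics.org/authority/newell"))
```

Rejecting unsafe characters before substitution is a small guard against malformed or injected updates when the ARK comes back from an external API.<br />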
The data above are those I consider most vital to SNAC integration--essential historical or biographical context and related archival or library resources that can be made more broadly accessible. I am not sure how many other authority systems are able to interact with SNAC with this degree of granularity yet, but I am hopeful that these features will propel more unique research materials into the public sphere.<br />
<br />
I will briefly touch on these new features when I present our comprehensive LOD-oriented numismatic research platform at SAA next month (I will upload the slideshow soon).</div>
Ethan Gruberhttp://www.blogger.com/profile/14492799646719449654noreply@blogger.com0tag:blogger.com,1999:blog-3664503123885891673.post-58551372742735239902018-06-07T13:36:00.000-07:002018-06-07T13:41:02.171-07:00SNAC Lookups Updated in xEAC and EADitor<div dir="ltr" style="text-align: left;" trbidi="on">
Since Social Networks and Archival Context (SNAC) migrated to a new platform, it has published a well-documented <a href="http://snaccooperative.org/api_help">JSON-based REST API</a>. Although <a href="https://github.com/ewg118/eaditor">EADitor</a> and <a href="https://github.com/ewg118/xEAC">xEAC</a> have had lookup mechanisms to link personal, corporate, and family entities from SNAC to EAD and EAC-CPF records since 2014 (see <a href="http://eaditor.blogspot.com/2014/08/extended-linked-data-controlled.html">here</a>), the lookup mechanisms in the XForms-based backends to these platforms interacted with an unpublicized web service that provided an XML response for simple queries.<br />
<br />
With the advent of these new SNAC APIs and JSON processing within the XForms 2.0 spec (present in <a href="http://www.orbeon.com/">Orbeon</a> since 2016), I have finally gotten around to overhauling the lookups in both EADitor and xEAC. Following documentation for the Search API, the XForms Submission process now submits (via PUT) an instance that conforms to the required JSON model. The @serialization attribute is set to "application/json" in the submission, and the JSON response from SNAC is serialized back into XML following the <a href="https://www.w3.org/community/xformsusers/wiki/XForms_2.0#External_JSON_values">XForms 2.0 specification</a>. Side note: the JSON->XML serialization differs between XForms 2.0 and XSLT/XPath 3.0, and so there should be more communication between these groups to standardize JSON->XML across all XML technologies.<br />
<br />
The following XML instance is transformed into API-compliant JSON upon submission. <br />
<br />
<code></code><br />
<pre><code><xforms:instance id="query-json" exclude-result-prefixes="#all">
<json type="object" xmlns="">
<command>search</command>
<term/>
<entity_type/>
<start>0</start>
<count>10</count>
</json>
</xforms:instance></code></pre>
<br />
<br />
The submission is as follows:<br />
<br />
<code></code><br />
<pre><code><xforms:submission id="query-snac" ref="instance('query-json')"
action="http://api.snaccooperative.org" method="put" replace="instance"
instance="snac-response" serialization="application/json">
<xforms:header>
<xforms:name>User-Agent</xforms:name>
<xforms:value>XForms/xEAC</xforms:value>
</xforms:header>
 <xforms:message ev:event="xforms-submit-error" level="modal">Error transforming
 into JSON and/or interacting with the SNAC
 API.</xforms:message>
</xforms:submission> </code></pre>
<br />
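To show roughly what goes over the wire, the following Python sketch converts the XML instance above into a JSON body using a simplified reading of the XForms 2.0 mapping: an element with type="object" becomes a JSON object, typed leaves become numbers, booleans, or null, and untyped leaves become strings. Arrays are not handled, and the sample term and entity_type values are assumptions for illustration. The sketch only builds the payload; it does not contact the SNAC API.<br />

```python
import json
import xml.etree.ElementTree as ET

INSTANCE = """<json type="object">
    <command>search</command>
    <term/>
    <entity_type/>
    <start>0</start>
    <count>10</count>
</json>"""


def to_json_value(elem):
    """Simplified XML-to-JSON conversion driven by the @type attribute."""
    kind = elem.get("type", "string")
    if kind == "object":
        return {child.tag: to_json_value(child) for child in elem}
    text = elem.text or ""
    if kind == "number":
        return json.loads(text)
    if kind == "boolean":
        return text.strip() == "true"
    if kind == "null":
        return None
    return text  # untyped leaves are treated as strings


def build_query(term, entity_type):
    """Fill in the search term and entity type, then serialize to JSON."""
    root = ET.fromstring(INSTANCE)
    root.find("term").text = term
    root.find("entity_type").text = entity_type
    return json.dumps(to_json_value(root))


print(build_query("Newell", "person"))
```

Note that under this reading, start and count are serialized as the strings "0" and "10", since the instance gives them no type attribute.<br />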
The SNAC URIs are placed into the entityIds within the cpfDescription/identity in EAC-CPF or as the @authfilenumber for a persname, corpname, or famname in EAD.<br />
<br />
The next task is to build APIs into xEAC for pushing data (biographical data, skos:exactMatch URIs, and related archival resources) directly into SNAC. By tomorrow, all (or nearly all) of the <a href="http://numismatics.org/authorities/">authorities</a> in the ANS Archives will be linked to SNAC URIs.</div>
Ethan Gruberhttp://www.blogger.com/profile/14492799646719449654noreply@blogger.com4tag:blogger.com,1999:blog-3664503123885891673.post-3460051882017288712018-05-18T07:58:00.000-07:002018-05-18T07:58:45.358-07:00Three new Edward Newell research notebooks added to Archer<div dir="ltr" style="text-align: left;" trbidi="on">
Three research notebooks of Edward T. Newell have been added to <a href="http://numismatics.org/archives/">Archer</a>, the archives of the American Numismatic Society. These had been scanned as part of the larger <a href="http://eaditor.blogspot.com/2014/06/first-newell-notebook-published-in.html">Newell digitization project</a>, which was <a href="http://eaditor.blogspot.com/2017/10/newell-notebooks-migrated-to-iiif.html">migrated into IIIF</a> for display in Mirador (with annotations) in late 2017.<br />
<br />
These three notebooks had been scanned, but TEI files had not been generated due to some minor oversight. Generating the TEI files was fairly straightforward--there's a small PHP script that will extract MODS from our Koha-based library catalog. These MODS files are subsequently run through an <a href="https://github.com/AmericanNumismaticSociety/migration_scripts/blob/master/newell/generate-tei.xsl">XSLT 3.0 stylesheet</a> to generate TEI with a facsimile listing of all image files associated with the notebook, linking to the IIIF service URI. XSLT 3.0 comes into play to parse the info.json for each image in order to insert the height and width of the source image directly into the TEI, which is used for the TEI->IIIF Manifest JSON transformation (the canvas and image portions of the manifest), which is now inherent to TEI files published in the <a href="https://github.com/ewg118/eaditor">EADitor</a> platform.<br />
<br />
The notebooks all share the same general theme: they are Newell's notes on the coins in the <a href="http://ikmk.smb.museum/home?lang=en">Berlin Münzkabinett</a>, which we aim to annotate in Mirador over the course of the NEH-funded <a href="http://numismatics.org/neh-hrc2017/">Hellenistic Royal Coinages</a> project.<br />
<br />
<ul style="text-align: left;">
<li><a href="http://numismatics.org/archives/ark:/53695/nnan189221">http://numismatics.org/archives/ark:/53695/nnan189221</a></li>
<li><a href="http://numismatics.org/archives/ark:/53695/nnan189222">http://numismatics.org/archives/ark:/53695/nnan189222</a></li>
<li><a href="http://numismatics.org/archives/ark:/53695/nnan189223">http://numismatics.org/archives/ark:/53695/nnan189223</a></li>
</ul>
A fourth notebook was found to have not yet been scanned, and so it will be published online soon. <br />
</div>
Ethan Gruberhttp://www.blogger.com/profile/14492799646719449654noreply@blogger.com0tag:blogger.com,1999:blog-3664503123885891673.post-59319529071625431992018-04-06T08:17:00.000-07:002018-04-06T08:17:00.734-07:00117 ANS ebooks published to Digital Library<div dir="ltr" style="text-align: left;" trbidi="on">
I have finally put the finishing touches on 117 ANS out-of-print publications that have been digitized into TEI (and made available as EPUB and PDF) as part of the NEH and Mellon-funded Open Humanities Book project. This is the "end" (more details on what an end entails later) of the project, in which about 200 American Numismatic Society monographs were digitized and made freely and openly available to the public.<br />
<br />
All of these, plus a selection of numismatic electronic theses and dissertations as well as two other ebooks not funded by the NEH-Mellon project, are available in the <a href="http://numismatics.org/digitallibrary/">ANS Digital Library</a>. The details of this project have been outlined in previous <a href="http://eaditor.blogspot.com/2017/01/more-than-80-lod-enhanced-ebooks.html">blog posts</a>, but to summarize, the TEI files have been annotated with thousands of links to people, places, and other types of entities defined in a variety of information systems--particularly Nomisma.org (for ancient entities), Wikidata, and Geonames (for modern ones).<br />
<br />
Additionally: <br />
<ul style="text-align: left;">
<li>Books have been linked to 153 coins (so far) in the ANS collection identified by accession number. Earlier books cite Newell's personal collection, bequeathed to the ANS and accessioned in 1944. A specialist will have to identify these.</li>
<li>173 total references to coin hoards defined in the <a href="http://coinhoards.org/">Inventory of Greek Coin Hoards</a>, plus several from Kris Lockyear's <a href="http://numismatics.org/chrr">Coin Hoards of the Roman Republic</a>.</li>
<li>166 references to Roman imperial coin types defined in the NEH-funded <a href="http://numismatics.org/ocre">Online Coins of the Roman Empire</a>.</li>
<li>A small handful of Islamic glass weights in The Metropolitan Museum of Art </li>
<li>One book by Wolfgang Fischer-Bossert, <i>Athenian Decadrachm</i>, has a <a href="http://dx.doi.org/10.26608/nnan146798">DOI</a>, connected to his ORCID.</li>
</ul>
Since each of these annotations is serialized into RDF and published in the ANS archival SPARQL endpoint, the other various information systems (<a href="http://numismatics.org/search">MANTIS</a>, IGCH, OCRE, etc.) query the endpoint for related archival or library materials.<br />
<br />
For example, the clipped shilling, <a href="http://numismatics.org/collection/1942.50.1">1942.50.1</a>, was minted in Boston, but the note says it was found among a mass of other clippings in London. The findspot is not geographically encoded in our database (and therefore doesn't appear on the map), but this coin is cited in "Part III Finds of American Coins Outside the Americas" in <i><a href="http://numismatics.org/digitallibrary/ark:/53695/nnan147765"><span style="font-weight: normal;">Numismatic finds of the Americas</span></a><span style="font-weight: normal;">.</span></i><br />
<i><span style="font-weight: normal;"><br /></span></i>
<br />
<h3 style="text-align: left;">
Using OpenRefine for Entity Reconciliation</h3>
Unlike the first phase of the project, the people and places tagged in these books were extracted into two enormous lists (20,000 total lines) that were reconciled against Wikidata, VIAF, or <a href="http://numishare.blogspot.com/2017/10/nomisma-launches-openrefine.html">Nomisma OpenRefine</a> reconciliation APIs. Nomisma was particularly useful because of the high degree of accuracy in matching people and places. Wikidata and VIAF were useful for modern people and places, but these were more challenging in that there might be dozens of American towns with the same name or numerous examples of Charles IV or other regents. I had to evaluate each name within the context of the passage in which it occurred, a tedious process that took nearly two months to complete. The end result, however, has significantly broader and more accurate coverage than the 85 books in the first iteration of the grant. After painstakingly matching entities to their appropriate identifiers, it only took about a day to write the scripts to incorporate the URIs back into the TEI files, and a few more days of manual or regex-based linking for IGCH, ANS coins, etc.<br />
<br />
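As an example of the regex-based linking mentioned above, the following Python sketch wraps ANS accession numbers found in running text in a TEI ref element pointing at the coin's URI. The accession number pattern (year.lot.item) and the choice of the ref element are assumptions for illustration and do not reproduce the scripts actually used for the project.<br />

```python
import re

BASE_URI = "http://numismatics.org/collection/"

# Assumed pattern: a four-digit year, then two numeric parts, e.g. 1942.50.1
ACCESSION = re.compile(r"\b(?:1[89]\d{2}|20\d{2})\.\d{1,4}\.\d{1,5}\b")


def link_accessions(text):
    """Wrap each accession number in a TEI ref pointing at its collection URI."""
    def wrap(match):
        number = match.group(0)
        return '<ref target="%s%s">%s</ref>' % (BASE_URI, number, number)
    return ACCESSION.sub(wrap, text)


print(link_accessions("The clipped shilling, ANS 1942.50.1, was found in London."))
```

This sketch is not idempotent: running it twice over the same text would wrap numbers that are already linked, so a real script would need to skip existing ref elements.<br />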
As a result of this effort, and through the concordance between Nomisma identifiers and Pleiades places, there are a total of 3,602 distinct book sections containing 4,304 Pleiades URIs, which can now be made available to scholars through the <a href="http://commons.pelagios.org/">Pelagios project</a>.<br />
<br />
<br />
<h3 style="text-align: left;">
What's Next for ANS Publications?</h3>
So while the project concludes in its official capacity, there is room for improvement and further integration. Now that the corpus has been digitized, it will be possible to export all of the references into OpenRefine in an attempt to restructure the TEI and link to URIs defined by Worldcat. We will want to link to other DOIs if possible, and make the references for each book available in Crossref. Some of this relies on the expansion of Crossref itself to support entity identifiers beyond ORCID (e.g., ISNI) and citations for Worldcat. Presently, DOI citation mechanisms allow us to build a network graph of citations for works produced in the last few years, but the extension of this graph to include older journals and monographs will allow us to chart the evolution of scientific and humanistic thought over the course of centuries.<br />
<br />
As we know, there is never an "end" to Digital Humanities projects. Only constant improvement. And I believe that the work we have done will open the door to a sort of born-digital approach to future ANS publications.</div>
Ethan Gruberhttp://www.blogger.com/profile/14492799646719449654noreply@blogger.com0tag:blogger.com,1999:blog-3664503123885891673.post-43342435887315911112017-10-31T11:57:00.002-07:002017-10-31T11:57:16.034-07:00EADitor now supports EAD and MODS to IIIF manifest generation<div dir="ltr" style="text-align: left;" trbidi="on">
After migrating the <a href="http://eaditor.blogspot.com/2017/10/newell-notebooks-migrated-to-iiif.html">Newell TEI notebooks</a> to support serialization of facsimiles into IIIF manifests and the rendering of these manifests in an embedded Mirador viewer, I implemented a transformation of EAD finding aid image collections and MODS records for photographs into manifests.<br />
<br />
<h3 style="text-align: left;">
EAD updates</h3>
The EAD finding aids were updated so that each daogrp, which previously linked to Flickr images, now links to thumbnail, reference, and IIIF service URLs (dao[@xlink:role='IIIFService']). The EAD is then transformed by XSLT into manifest JSON, with an intermediate step in which the Orbeon XForms processor, called from XPL, iterates through the IIIFService info.json files to extract the height and width needed to generate a canvas for each image.<br />
<br />
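The info.json step can be sketched outside of XPL and XSLT. The Python below reads the width and height from an info.json document and builds a canvas in the shape used by version 2.x of the IIIF Presentation API. The sample info.json, the example.org URLs, and the function name are assumptions for illustration; EADitor's actual output may differ in detail.<br />

```python
import json

# Stand-in for the body returned by <IIIF service URL>/info.json (placeholder URL).
SAMPLE_INFO = """{
  "@context": "http://iiif.io/api/image/2/context.json",
  "@id": "http://example.org/iiif/image1",
  "protocol": "http://iiif.io/api/image",
  "width": 2000,
  "height": 3000,
  "profile": ["http://iiif.io/api/image/2/level1.json"]
}"""


def build_canvas(info_text, canvas_id, label):
    """Build a Presentation API 2.x canvas from an Image API info.json document."""
    info = json.loads(info_text)
    service_id = info["@id"]
    width, height = info["width"], info["height"]
    profile = info.get("profile")
    if isinstance(profile, list):
        profile = profile[0]  # the first entry is the compliance level URI
    return {
        "@id": canvas_id,
        "@type": "sc:Canvas",
        "label": label,
        "height": height,
        "width": width,
        "images": [{
            "@type": "oa:Annotation",
            "motivation": "sc:painting",
            "on": canvas_id,
            "resource": {
                "@id": service_id + "/full/full/0/default.jpg",
                "@type": "dctypes:Image",
                "format": "image/jpeg",
                "height": height,
                "width": width,
                "service": {
                    "@context": "http://iiif.io/api/image/2/context.json",
                    "@id": service_id,
                    "profile": profile,
                },
            },
        }],
    }


canvas = build_canvas(SAMPLE_INFO, "http://example.org/manifest/canvas/1", "Image 1")
print(json.dumps(canvas, indent=2))
```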
The <a href="http://numismatics.org/archives/ark:/53695/nnan0037">Brett finding aid</a> now includes clickable thumbnails that will launch the zoomable Leaflet viewer in a fancybox popup window. At the top of the page, the user can download the <a href="http://numismatics.org/archives/manifest/nnan0037">manifest</a>, and there's also a link to <a href="http://numismatics.org/mirador/?manifest=http://numismatics.org/archives/manifest/nnan0037">view</a> the manifest in our internal Mirador viewer. You can view the EAD XML (link at top) for more details.<br />
<br />
<h3 style="text-align: left;">
MODS updates</h3>
The updates to the MODS were twofold. First, in the previous version of Archer, all photographs were suppressed from the public regardless of copyright concerns. We have re-evaluated these concerns by applying one of several Rights Statements. Two of these rights statements are the most permissive, and therefore, we will display the high resolution image when we have every right to do so. In any case, thumbnails are Fair Use, and therefore, they are always visible in the record page and the search results pages.<br />
<br />
Where copyright allows us to do so, the MODS file includes a URL for the reference image and a URL[@access='raw object' and @note='IIIFService']. When a IIIFService URL is present in the MODS record, the XSLT transformation will include a Leaflet div and initiate the display of the image. See <a href="http://numismatics.org/archives/ark:/53695/I00000554">A Portrait Photograph of Margaret Thompson</a>, for example. Like the finding aid, a manifest is dynamically generated from MODS, but only one XForms processor is called to extract the height and width from the info.json for the single image linked in the MODS file.<br />
<br />
<h3 style="text-align: left;">
Pelagios Updates</h3>
Since the Brett collection links many photographs to ancient places defined in the <a href="https://pleiades.stoa.org/">Pleiades Gazetteer of Ancient Places</a>, I have updated the EADitor RDF output for <a href="http://commons.pelagios.org/">Pelagios</a>. The output now includes IIIF service metadata conforming to the Europeana Data Model specification. Rainer Simon has imported these photographs into <a href="https://twitter.com/aboutgeo/status/918460929234423808">Peripleo</a>.</div>
Ethan Gruberhttp://www.blogger.com/profile/14492799646719449654noreply@blogger.com0tag:blogger.com,1999:blog-3664503123885891673.post-56983183148037644802017-10-06T07:53:00.000-07:002018-05-24T08:14:18.219-07:00Newell notebooks migrated to IIIF<div dir="ltr" style="text-align: left;" trbidi="on">
As part of our transition to IIIF for high resolution photographs for the numismatic collection in <a href="http://numismatics.org/search">MANTIS</a> (see <a href="http://numismatics.org/collection/1944.100.45250">http://numismatics.org/collection/1944.100.45250</a> for example), I have begun to migrate our archival images into IIIF as well. These new features will be available in our new dedicated server as soon as the migration of Wordpress from one server to another is complete, which I expect in the next few weeks. The implementation of IIIF for our archival resources entails three overhauls of the current metadata model and HTML/IIIF Manifest serialization: TEI (for Newell notebooks of facsimile images), Encoded Archival Description (EAD) finding aids, and MODS. The transformation of the TEI notebooks into IIIF compliance is completed, and the functionality for EAD and MODS has been built, but the XML data have not been fully updated to link to IIIF services (mainly because the high resolution images haven't been uploaded to the server yet).<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEir6Rvy0xi7iX2uDW0GtkzH8dmFBa50U0uetyiyxYJkIDHLLA01GdkiEB9ohrzXZKkmeqoDZhlqXWzh7NH_SMaqj95qzQeuPTWWwj0Glfd8zwehcnPCBLk1XveeJcZnRE28xc0M7MaJ740/s1600/Screenshot+from+2017-10-06+10-52-00.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="992" data-original-width="1600" height="247" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEir6Rvy0xi7iX2uDW0GtkzH8dmFBa50U0uetyiyxYJkIDHLLA01GdkiEB9ohrzXZKkmeqoDZhlqXWzh7NH_SMaqj95qzQeuPTWWwj0Glfd8zwehcnPCBLk1XveeJcZnRE28xc0M7MaJ740/s400/Screenshot+from+2017-10-06+10-52-00.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Annotated Newell notebook IIIF manifest displayed in Mirador</td></tr>
</tbody></table>
<br />
<br />
<h3 style="text-align: left;">
TEI to IIIF Manifest</h3>
The first Newell notebook was published to <a href="http://numismatics.org/archives">Archer</a> (built on <a href="https://github.com/ewg118/eaditor">EADitor</a>) more than three years ago. There are now about 50 notebooks published, but only a handful have been annotated to link to people, <a href="http://coinhoards.org/">IGCH hoards</a>, and coins in our collection (we will complete the annotation as part of the <a href="http://numismatics.org/neh-hrc2017/">Hellenistic Royal Coinages</a> project). To summarize the <a href="http://eaditor.blogspot.com/2014/06/first-newell-notebook-published-in.html">technical underpinnings</a>, each notebook is a TEI file with facsimile elements for each page. The facsimile contains a link to the image and 0-n surface elements representing annotations. These surface elements were created by roundtripping the <a href="https://annotorious.github.io/">Annotorious</a>/OpenLayers annotation JSON <-> TEI. The @ulx, @uly, @lrx, and @lry attributes represent the coordinates of the upper left and lower right hand corners of the annotations, and the coordinates were relative ratios based on OpenLayers bounds.<br />
<br />
For IIIF compliance, I ran the TEI through an XSLT 3 transformation to load the info.json metadata from our IIIF image server to extract the height and width of each image, and then recalculate the coordinates to be more in line with Web Annotation segments. The lower right coordinates are still stored in the TEI, but upon generation of annotation lists for the manifest, the upper left coordinates are subtracted from the lower right coordinates to establish the annotation width and height.<br />
<br />
<pre><code> <surface lrx="1540" lry="155" ulx="1182" uly="54" xml:id="aho40v9vbhq7">
<desc>
<ref target="http://coinhoards.org/id/igch1516">IGCH 1516</ref>
</desc>
</surface>
</code></pre>
<br />
The tei:facsimile to annotation list transformation outputs:
<br />
<br />
<pre>http://numismatics.org/archives/manifest/nnan187715/canvas/nnan0-187715_X006#xywh=1182,54,358,101</pre>
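The arithmetic behind that fragment can be sketched in a few lines of Python. This only illustrates the calculation (the actual transformation is XSLT), and the function name is hypothetical; the coordinates and canvas URI are the ones from the tei:surface example above.

```python
def xywh_fragment(canvas_uri, ulx, uly, lrx, lry):
    """Convert TEI upper-left/lower-right corners into a canvas URI with an xywh fragment."""
    width = lrx - ulx   # lower right x minus upper left x
    height = lry - uly  # lower right y minus upper left y
    return f"{canvas_uri}#xywh={ulx},{uly},{width},{height}"


if __name__ == "__main__":
    canvas = "http://numismatics.org/archives/manifest/nnan187715/canvas/nnan0-187715_X006"
    # Values from the surface element: ulx=1182, uly=54, lrx=1540, lry=155
    print(xywh_fragment(canvas, 1182, 54, 1540, 155))
```

With these inputs the width is 1540 - 1182 = 358 and the height is 155 - 54 = 101, which matches the annotation target shown above.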
<br />
<br />
The tei:graphic was replaced with tei:media[@type='IIIFService'], with the @url pointing to the IIIF service URI instead of an image location. XSLT transformations for the manifest, HTML, RDF, and Solr outputs do the rest.<br />
<br />
The Javascript has been updated so that clicking on a page under the index of annotations will force Mirador to change to the correct canvas.<br />
<br />
You can see an example here: <a href="http://numismatics.org/archives/id/nnan187715">http://numismatics.org/archives/id/nnan187715</a><br />
<br />
I will post another update on EAD and MODS -> IIIF next week. </div>
Ethan Gruberhttp://www.blogger.com/profile/14492799646719449654noreply@blogger.com0tag:blogger.com,1999:blog-3664503123885891673.post-58218585200482002182017-08-10T09:27:00.003-07:002017-08-10T09:27:50.807-07:00First DOIs minted for ANS Digital Library items<div dir="ltr" style="text-align: left;" trbidi="on">
Several weeks ago, we migrated an older, circa 2002 TEI ebook on the Taranto 1911 hoard, authored by John Kroll and Sebastian Heath, into our <a href="http://numismatics.org/digitallibrary/ark:/53695/taranto1911">Digital Library</a>. The original TEI file and subsequent updates have been loaded into our <a href="https://github.com/AmericanNumismaticSociety/tei-ebooks">TEI Github repository</a>. The updates follow transcription precedents that we have set in older ANS-published printed monographs as part of the Mellon-funded Open Humanities Book Program: relevant places, objects, people, etc. have been linked to entities in LOD systems, such as <a href="http://nomisma.org/">Nomisma.org</a>. All of the objects within this hoard (itself linked to <a href="http://coinhoards.org/id/igch1874">IGCH 1874</a>) are in the British Museum and linked to their URIs. Upon publication into the ANS Digital Library, the document parts are now accessible from the IGCH 1874 record and (eventually) in <a href="http://commons.pelagios.org/">Pelagios</a>, connected to relevant ancient places.<br />
<br />
Since Sebastian is an active scholar, with an <a href="http://orcid.org/0000-0003-2039-429X">ORCID</a>, this document served as a proof of concept for the next iteration of ANS digital publication: that our current and future monographs and journal articles, once issued openly online, should be connected to ORCIDs for their authors, and publication metadata should be submitted to Crossref to mint a DOI and enhance accessibility. Furthermore, since there's a direct connection between ORCID and Crossref submissions, this new digital publication workflow would automatically populate an author's scholarly profile with ANS publications. This is a vast improvement over the likes of Academia.edu, which requires manual submission. The broad vision is this:<br />
<br />
<i>Regardless of whether an author submits works through the American Numismatic Society Digital Library, Zenodo.org, Humanities Commons, their own institutional repository, or an Open Access journal system, their ORCID profile is the central, canonical aggregation of the entirety of their intellectual output (which includes datasets, software, etc.).</i><br />
<br />
This aggregation system between DOIs and ORCIDs, following Linked Open Data principles, is the future of academic publication. Ideally, it should be expanded beyond citations to modern works with DOIs and ORCIDs to include more historic works defined by Worldcat and linked to historic scholars with ISNI identifiers. It would take a tremendous amount of work, but in theory, it would be possible to create a network graph of citations across all disciplines, going back in history to the advent of the printed book, charting the evolution of how knowledge is generated and disseminated. Therefore, Crossref, ISNI, and ORCID would perhaps play a greater role than providing simple (and superficial) citation metrics in enabling us to develop a broader historiography and analysis of scholarship itself. We plan to mint DOIs for our historical publications eventually, if Crossref extends its XML schema to support ISNI identifiers.<br />
<br />
<h3 style="text-align: left;">
Under the Hood</h3>
Some extensions were implemented in <a href="https://github.com/AmericanNumismaticSociety/etdpub/">ETDPub</a>, the TEI/MODS publication framework that underlies the ANS Digital Library. First, I authored XSLT stylesheets that would crosswalk TEI or MODS into the appropriate Crossref XML model according to their schema version 4.4.0. You can see the Crossref XML for my MA thesis here: <a href="http://numismatics.org/digitallibrary/ark:/53695/gruber_roman_numismatics.xref">http://numismatics.org/digitallibrary/ark:/53695/gruber_roman_numismatics.xref</a>.<br />
<br />
XSLT:<br />
<ul style="text-align: left;">
<li><a href="https://github.com/AmericanNumismaticSociety/etdpub/blob/master/ui/xslt/serializations/tei/crossref.xsl">TEI to Crossref</a></li>
<li><a href="https://github.com/AmericanNumismaticSociety/etdpub/blob/master/ui/xslt/serializations/mods/crossref.xsl">MODS (ETD) to Crossref</a></li>
</ul>
If the author/editor URI matches an ORCID URI in the TEI, then the Admin panel in ETDPub will enable the publication of the metadata to Crossref. Similarly, within the MODS ETD editing interface (in XForms), a user can insert a mods:nameIdentifier[@type='orcid'] under the mods:name for an author/editor in order to capture the ORCID. So far, only TEI or MODS records with ORCIDs attached to people are available for submission into Crossref to mint a DOI.<br />
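The eligibility test described above can be sketched roughly as follows. This is a Python illustration rather than the XForms logic in ETDPub, and the function names are hypothetical. It also assumes that a single ORCID among the authors or editors is enough to enable submission, which the workflow above does not spell out.

```python
import re

# ORCID iDs are four groups of four digits; the final character may be a checksum "X"
ORCID_URI = re.compile(r"^https?://orcid\.org/\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def is_orcid_uri(uri):
    """Return True if the URI has the shape of an ORCID URI (the checksum is not verified)."""
    return bool(ORCID_URI.match(uri))


def eligible_for_crossref(author_uris):
    """Assumption: a record is eligible when at least one author/editor URI is an ORCID."""
    return any(is_orcid_uri(uri) for uri in author_uris)


if __name__ == "__main__":
    print(eligible_for_crossref(["http://orcid.org/0000-0003-2039-429X"]))
    print(eligible_for_crossref(["http://numismatics.org/authority/newell"]))
```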
<br />
<h4 style="text-align: left;">
Submission Workflow</h4>
In the admin panel, if a document is eligible for submission to Crossref, a checkbox is available. Clicking on this will fire off a series of actions in the XForms engine:<br />
<ol style="text-align: left;">
<li>The TEI/MODS-to-Crossref XML transformation is executed and loaded into an XForms instance</li>
<li>The Crossref XML is serialized to /tmp because it must be attached via multipart/form-data</li>
<li>Because I still have difficulty getting multipart/form-data to execute correctly in the XForms engine, the engine instead interacts with a PHP script via CGI</li>
<li>After the PHP script responds with a successful HTTP code, the MODS/TEI document is loaded in the XForms engine in order to insert the DOI in the proper location within the document</li>
<li>The TEI/MODS file is saved back to eXist, and the standard publication workflow is executed (a chain of XForms submissions), updating the Solr search index and the triplestore/SPARQL endpoint </li>
</ol>
So far two documents in the Digital Library have DOIs connected to ORCIDs:<br />
<br />
Taranto 1911: <a href="http://dx.doi.org/10.26608/taranto1911">http://dx.doi.org/10.26608/taranto1911</a><br />
My thesis (Recent Advancements in Roman Numismatics): <a href="http://dx.doi.org/10.26608/gruber_roman_numismatics">http://dx.doi.org/10.26608/gruber_roman_numismatics</a></div>
Ethan Gruberhttp://www.blogger.com/profile/14492799646719449654noreply@blogger.com0tag:blogger.com,1999:blog-3664503123885891673.post-91420272238524654332017-07-14T12:34:00.000-07:002017-07-14T12:34:16.770-07:00Improved mapping in EADitor - Brett archaeology photos as a test<div dir="ltr" style="text-align: left;" trbidi="on">
At long last, I have migrated from OpenLayers to Leaflet in <a href="https://github.com/ewg118/eaditor">EADitor</a>. This required modifications in two areas: the HTML pages for rendering EAD finding aids and the map interface. As a result, I introduced two new serializations:<br />
<br />
<ul style="text-align: left;">
<li>The map interface renders Solr search results serialized into GeoJSON (instead of OpenLayers displaying Solr->KML as before)</li>
<li>A transformation of an EAD finding aid into GeoJSON. A GeoJSON point is created for all unique mappable places from <a href="http://www.geonames.org/">Geonames</a> or <a href="https://pleiades.stoa.org/">Pleiades</a>, and coordinates are extracted in real time by reading Geonames APIs or Pleiades RDF. The GeoJSON features include references to all uniquely addressable components that include that place in the controlaccess element. You can append the extension '.geojson' to get a JSON response. Content negotiation will be implemented eventually. See <a href="http://numismatics.org/archives/ark:/53695/nnan0037.geojson">http://numismatics.org/archives/ark:/53695/nnan0037.geojson</a> for example.</li>
</ul>
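The second serialization can be sketched as follows: collapse every place mention in the finding aid into one GeoJSON point per unique place URI, and list the components that reference it. This is a minimal Python illustration, not the EADitor XSLT. The property names (uri, name, components), the coordinates, and the second component ID are hypothetical, and in practice the coordinates would come from the Geonames APIs or Pleiades RDF.

```python
import json


def places_to_geojson(mentions):
    """Collapse (place, component) mentions into one GeoJSON point per unique place URI."""
    features = {}
    for m in mentions:
        feature = features.setdefault(m["uri"], {
            "type": "Feature",
            # GeoJSON positions are ordered longitude, latitude
            "geometry": {"type": "Point", "coordinates": [m["lon"], m["lat"]]},
            "properties": {"uri": m["uri"], "name": m["name"], "components": []},
        })
        components = feature["properties"]["components"]
        if m["component"] not in components:
            components.append(m["component"])
    return {"type": "FeatureCollection", "features": list(features.values())}


if __name__ == "__main__":
    # Illustrative values only; d1e140 is a made-up component ID
    mentions = [
        {"uri": "http://www.geonames.org/264371", "name": "Athens",
         "lat": 37.98, "lon": 23.73, "component": "d1e131"},
        {"uri": "http://www.geonames.org/264371", "name": "Athens",
         "lat": 37.98, "lon": 23.73, "component": "d1e140"},
    ]
    print(json.dumps(places_to_geojson(mentions), indent=2))
```

Keying the features on the place URI is what keeps a place mentioned by many photographs from producing many overlapping markers on the map.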
<h3 style="text-align: left;">
Restructuring the Agnes Baldwin Brett finding aid</h3>
<a href="http://numismatics.org/authority/brett">Agnes Baldwin Brett</a> was a curator at the ANS from 1909 to 1912 and a prominent scholar of Greek numismatics. Our archives hold a variety of interesting materials, including photographs from her travels around Greece, Italy, and Turkey in the early 1900s. Numerous photos were digitized, uploaded to flickr Commons, and linked to the <a href="http://numismatics.org/archives/ark:/53695/nnan0037">Brett EAD finding aid</a>. Some photographs were identified and described (with brief text snippets) by ANS archivist David Hill, but all photographs were placed in a single series-level component. All identifiable places were linked in EADitor's Geonames lookup mechanism in a top-level controlaccess element. There was no direct correlation between individual photographs and the people, places, and things depicted.<br />
<br />
In order to demonstrate the full functionality of the new mapping interface, I finally took the time to restructure the finding aid so that each photograph would appear in its own item-level component with a controlaccess element enabling individual identification of the place depicted in the photo. Furthermore, while many finding aids have been linked to modern places defined in Geonames, the Brett collection of archaeological photographs provided an opportunity to link photos to ancient places in Pleiades, which would, in turn, open the door to the integration of these valuable materials into the wider Linked Ancient World Data cloud via <a href="http://commons.pelagios.org/">Pelagios</a>. The photos feature Mycenaean tombs, Greek temples, and even the <a href="https://en.wikipedia.org/wiki/Grave_Stele_of_Hegeso">Grave Stele of Hegeso</a>.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmhnAZJ4NLmsxtiJDD1s03lBZAgqjIwQJneNOxF9DnT43aIQu9PV_koFlF_ch-wL9Yy58A3IJe_mEXteMj996d3OA7dHfmhuIUB-RcorCNINA_Qfy2mjBKb0Tdh4ZfBh1W9BvXWdKIOaM/s1600/Screenshot+from+2017-07-14+15-05-22.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="423" data-original-width="470" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmhnAZJ4NLmsxtiJDD1s03lBZAgqjIwQJneNOxF9DnT43aIQu9PV_koFlF_ch-wL9Yy58A3IJe_mEXteMj996d3OA7dHfmhuIUB-RcorCNINA_Qfy2mjBKb0Tdh4ZfBh1W9BvXWdKIOaM/s400/Screenshot+from+2017-07-14+15-05-22.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Identifying individual monuments within Athens</td></tr>
</tbody></table>
<br />
<br />
Not only that, some photographs feature other students from the American School of Classical Studies at Athens who went on to be prominent scholars later in life. Since many of these scholars have produced published works and archival materials held at other institutions, they have URIs in the <a href="http://socialarchive.iath.virginia.edu/">Social Network and Archival Context</a> project. EADitor has had SNAC lookups for quite some time, and so I was able to link photos to these URIs when applicable. I hope that we can make these photos available to researchers even beyond the ancient world.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhqvnngAmmSy3bRczL-_0GUgil1GICpe-tgOXAHlPfP-a6yW5ZCyHvGKJ7ksfHjjQ0o6BHJxCGLMJ3uTKtjEo4rWSdWOxzSW2b9NTLQ1pKn3jJWhZhQK2CWbGq2XVeZiGciqn4YNX69sF8/s1600/Screenshot+from+2017-07-11+10-28-42.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="942" data-original-width="994" height="378" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhqvnngAmmSy3bRczL-_0GUgil1GICpe-tgOXAHlPfP-a6yW5ZCyHvGKJ7ksfHjjQ0o6BHJxCGLMJ3uTKtjEo4rWSdWOxzSW2b9NTLQ1pKn3jJWhZhQK2CWbGq2XVeZiGciqn4YNX69sF8/s400/Screenshot+from+2017-07-11+10-28-42.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Linking people to SNAC</td></tr>
</tbody></table>
In addition to the tagging of places and people, many photographs feature known archaeological monuments that are notable enough to warrant their own Wikipedia articles, and therefore Wikidata entity URIs. I extended the subject lookup mechanism in EADitor beyond the standard Library of Congress Subject Headings to query the Wikidata API, embedding entity IDs directly into the EAD finding aid, which are then transformed into dcterms:subject URIs upon RDF serialization.<br />
<br />
<h3 style="text-align: left;">
EAD to RDF</h3>
Since each individual component has an ID in EADitor, each component is uniquely addressable by fragment identifiers, e.g., <a href="http://numismatics.org/archives/ark:/53695/nnan0037#d1e131">http://numismatics.org/archives/ark:/53695/nnan0037#d1e131</a>. After I made some minor modifications to the RDF output to conform with the emerging schema.org archival extension, these Wikidata, SNAC, Pleiades, and Geonames URIs are exposed in the RDF for each component, and the components are hierarchically linked together.<br />
<br />
<blockquote class="tr_bq">
<pre><span class="k">@prefix </span><span class="nv">arch: </span><span class="nn"><http://purl.org/archival/vocab/arch#> .</span>
<span class="k">@prefix </span><span class="nv">dcterms: </span><span class="nn"><http://purl.org/dc/terms/> .</span>
<span class="k">@prefix </span><span class="nv">foaf: </span><span class="nn"><http://xmlns.com/foaf/0.1/> .</span>
<span class="k">@prefix </span><span class="nv">rdf: </span><span class="nn"><http://www.w3.org/1999/02/22-rdf-syntax-ns#> .</span>
<span class="k">@prefix </span><span class="nv">rdfs: </span><span class="nn"><http://www.w3.org/2000/01/rdf-schema#> .</span>
<span class="k">@prefix </span><span class="nv">schema: </span><span class="nn"><http://schema.org/> .</span>
<span class="k">@prefix </span><span class="nv">xml: </span><span class="nn"><http://www.w3.org/XML/1998/namespace> .</span>
<span class="k">@prefix </span><span class="nv">xsd: </span><span class="nn"><http://www.w3.org/2001/XMLSchema#> .</span>
<span class="nc"><http://numismatics.org/archives/ark:/53695/nnan0037#d1e131></span><span class="o"> a </span><span class="na">schema:ArchiveItem </span>;
<span class="o">dcterms:coverage </span><span class="na"><http://www.geonames.org/264371></span> ;
<span class="o">dcterms:date </span><span class="s">"1900-12-07"^^xsd:date </span>;
<span class="o">dcterms:identifier </span><span class="s">"06-00242" </span>;
<span class="o">dcterms:isPartOf </span><span class="na"><http://numismatics.org/archives/ark:/53695/nnan0037#c_92f631e3f903281a8cdedbfebfca0654></span> ;
<span class="o">dcterms:subject </span><span class="na"><http://socialarchive.iath.virginia.edu/ark:/99166/w61c5qjp></span> ;
<span class="o">dcterms:title </span><span class="s">"American School students wearing bug bags" </span>;
<span class="o">dcterms:type </span><span class="na"><http://vocab.getty.edu/aat/300046300></span> ;
<span class="o">foaf:depiction </span><span class="na"><http://farm9.staticflickr.com/8320/8003385533_c83827b679_o.jpg></span> ;
<span class="o">foaf:thumbnail </span><span class="na"><http://farm9.staticflickr.com/8320/8003385533_55f1f093b1_t.jpg></span> .</pre>
</blockquote>
<br />
This RDF is posted into Archer's SPARQL endpoint.<br />
<br />
<h3 style="text-align: left;">
Archer RDF → SPARQL → Pelagios RDF</h3>
Now that we have numerous uniquely addressable photographs linked to Pleiades URIs published in our SPARQL endpoint, it was a breeze to create an RDF export for Pelagios. It is essentially a DESCRIBE query, and our model of RDF is run through XSLT into the Pelagios data model.<br />
<br />
<blockquote>
<pre>PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms: <http://purl.org/dc/terms/>
DESCRIBE ?s WHERE {
?s dcterms:coverage ?place FILTER (strStarts(str(?place), 'https://pleiades.stoa.org'))
}</pre>
</blockquote>
<br />
The link to the Pelagios VoID is available on the front page of Archer. It is generated by an ASK query similar to above to see whether there are any objects in the SPARQL endpoint with Pleiades places expressed by the dcterms:coverage property.<br />
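A rough sketch of that check, in Python for illustration: the ASK query reuses the pattern of the DESCRIBE query, and the boolean answer decides whether the VoID link is shown. The function name is hypothetical, and the sketch assumes the endpoint returns the standard SPARQL 1.1 JSON results format; the actual check in Archer may be implemented differently.

```python
# ASK variant of the DESCRIBE query: is any resource linked to a Pleiades place?
PLEIADES_ASK = """PREFIX dcterms: <http://purl.org/dc/terms/>
ASK {
  ?s dcterms:coverage ?place FILTER (strStarts(str(?place), 'https://pleiades.stoa.org'))
}"""


def show_void_link(ask_result):
    """Interpret a SPARQL 1.1 JSON ASK response, e.g. {"head": {}, "boolean": true}."""
    return bool(ask_result.get("boolean", False))


if __name__ == "__main__":
    # Hypothetical response from the endpoint after sending PLEIADES_ASK
    print(show_void_link({"head": {}, "boolean": True}))
```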
<br />
<h3 style="text-align: left;">
Summary</h3>
The Brett collection is incredibly interesting, and I hope that we will be able to digitize more photographs and the corresponding travel diary at some point in the future. There are still many photographs that haven't been identified, and so perhaps we might be able to accomplish this through crowdsourcing. We will implement a IIIF server by the end of summer and begin the transition of our archival materials into IIIF--not only photographs, but also the Newell diaries. Perhaps one day we will be able to annotate the people, places, and things from the Brett diary and photographs with <a href="http://projectmirador.org/">Mirador</a> or a similar IIIF viewer. While Pelagios integration is somewhat imminent, the aggregation of disparate archival holdings through shared SNAC identifiers is still further off on the horizon.</div>
Ethan Gruberhttp://www.blogger.com/profile/14492799646719449654noreply@blogger.com0tag:blogger.com,1999:blog-3664503123885891673.post-4791288980536059712017-02-28T14:29:00.000-08:002017-02-28T14:43:52.221-08:00Final four Mellon-funded TEI ebooks published<div dir="ltr" style="text-align: left;" trbidi="on">
The final four of a group of 86 American Numismatic Society-published books have been checked and uploaded to our <a href="http://numismatics.org/digitallibrary">Digital Library</a>. Here are some stats I was able to produce from various SPARQL queries of the TEI->Open Annotation RDF:<br />
<br />
<ul style="text-align: left;">
<li>349 mentions of 164 different Greek coin hoards published in <a href="http://coinhoards.org/">IGCH</a> in 193 sections in 14 books.</li>
<li>266 unique references to <a href="http://nomisma.org/">nomisma</a> URIs. 146 are mints or regions, and 87 of these identifiers are matches with <a href="https://pleiades.stoa.org/">Pleiades</a> places. These mint references appear in 600 sections in 51 books. Including direct Pleiades references (and not only those which are implicit by means of Nomisma concordances), there are 621 sections in these 51 books which will be accessible through the <a href="http://commons.pelagios.org/">Pelagios Project</a>.</li>
<li>97 of the 266 references are to people, most of whom are linked to <a href="https://www.wikidata.org/">Wikidata</a> and <a href="http://viaf.org/">VIAF</a> entities that are, in turn, linked to other systems, such as <a href="http://socialarchive.iath.virginia.edu/">Social Networks and Archival Context</a></li>
<li>More than 1,400 coins in the <a href="http://numismatics.org/search">ANS collection</a> are referenced</li>
<li>139 Roman Imperial coin types in <a href="http://numismatics.org/ocre">OCRE</a></li>
<li>4 Roman Republican coin types in <a href="http://numismatics.org/crro">CRRO </a></li>
</ul>
These four are the final of 86 total books digitized as part of the NEH-Mellon Open Humanities Book program. Many thanks to both the National Endowment for the Humanities and the Mellon Foundation for making this possible. The framework and methodologies implemented in this project will be applied to further digitization here at the ANS as we move toward making our entire collection of monographs freely and openly accessible, and I hope that other academic publishers and learned societies will follow in our footsteps in this endeavor.<br />
<br />
These books go beyond simple transcription and publication as EPUB files. With links to our own research databases internally and externally to Linked Open Data information systems, we hope that these works will be transformed into research portals to further context about the people, places, events, etc. mentioned in the text. On the other side of the coin, so to speak, researchers interested in the entities, objects, coin hoards, etc. will have access to a wealth of historical information about these things and will gain access to our monographs not only from our own Library, Archive, and Museum systems, but through projects like Pelagios, <a href="https://dp.la/">Digital Public Library of America</a>, and other large scale aggregators of cultural heritage materials.<br />
</div>
Ethan Gruberhttp://www.blogger.com/profile/14492799646719449654noreply@blogger.com0tag:blogger.com,1999:blog-3664503123885891673.post-31287110396758256492017-01-13T12:21:00.002-08:002017-01-13T12:37:09.740-08:00More than 80 LOD-enhanced ebooks published to the ANS Digital Library<div dir="ltr" style="text-align: left;" trbidi="on">
The American Numismatic Society has nearly completed its Mellon Foundation-funded Humanities Open Book program. Eighty-two of 86 books have been enhanced by Whitney Christopher, a TEI specialist from the King's College London DH program, to link to people and places defined on <a href="http://nomisma.org/">Nomisma.org</a>, <a href="http://pleiades.stoa.org/">Pleiades</a> (either directly linked or by means of Nomisma's internal concordance system), VIAF, Wikidata, and the <a href="http://numismatics.org/authorities">ANS's own archival authority control system</a>. The final four books will go online soon. They are all available in the <a href="http://numismatics.org/digitallibrary">ANS Digital Library</a>.<br />
<br />
The number of people and places mentioned in these texts is a staggering figure, and it should be noted that we have focused on linking those entities that are most relevant to the texts, but we will continue to refine the linking over time, especially when it comes to Nomisma concepts and bibliographic references to Worldcat Works (links to which have not yet been incorporated). As Nomisma expands further into the Greek world and other domains of numismatics (after the ancient period), we will return to these ebooks to insert or replace links to Nomisma mints, people, and political entities.<br />
<br />
Beyond relevant people and places, we have inserted hundreds of links to <a href="http://coinhoards.org/">IGCH</a> records (about 170 different coin hoards are cited in 400 locations in a handful of books), to the ANS collection, and to coin types defined in OCRE or CRRO. So far, more than 100 coins in the ANS and 6 in the Smithsonian American Art Museum have been identified by their accession numbers, although one of the four remaining books to be published will soon include nearly 70 more links to ANS coins. There are many more coins referenced in these books that may <i>now</i> belong to the ANS, but were not accessioned at the date of publication. A curator with more specific knowledge will need to identify these in the future.<br />
<br />
One of the most often cited hoards is the <a href="http://coinhoards.org/id/igch1664">Demanhur Hoard</a> (IGCH 1664), which is mentioned in four books and on various pages of two of <a href="http://numismatics.org/authority/newell">Edward Newell</a>'s notebooks. By linking archival authorities mentioned in these texts, we have greatly enhanced access to the works by and about Edward Newell and other prominent numismatic figures associated with the Society. A user of the ANS's authority portal (built on EAC-CPF) will have access to books written by Newell in our digital library, as well as his archival materials. Furthermore, mentions of Newell from the books written by other scholars will appear under <a href="http://numismatics.org/authority/newell#annotations">annotations</a>. In his case, he is mentioned in 18 other books, sometimes in multiple sections.<br />
<br />
Like Mantis, the OCRE and CRRO config files have been updated to link to our archival SPARQL endpoint, and therefore annotations about specific types are accessible directly through types defined in these systems. Nearly 50 types in OCRE are linked from <a href="http://numismatics.org/digitallibrary/ark:/53695/nnan8359">Roman Medallions</a>, and a researcher can drill down into a specific section of the book from <a href="http://numismatics.org/ocre/id/ric.5.gall_sala(2).1">RIC 5 Gallienus and Salonina 1</a>.<br />
<br />
Finally, through the links to Pleiades, each section in each book that mentions an ancient place will be accessible in <a href="http://commons.pelagios.org/">Pelagios</a>.</div>
Ethan Gruberhttp://www.blogger.com/profile/14492799646719449654noreply@blogger.com0tag:blogger.com,1999:blog-3664503123885891673.post-3858755313923235132016-09-26T10:39:00.002-07:002016-09-26T10:41:37.423-07:00Publication of the NEH/Mellon Open Humanities ebooks<div dir="ltr" style="text-align: left;" trbidi="on">
About a month ago, we pushed about 85 TEI files into production in the <a href="http://numismatics.org/digitallibrary/">ANS Digital Library</a>. These ebooks were transcribed from HathiTrust scans as part of the NEH/Mellon Open Humanities Book Program. Not all of the books have value-added tagging yet. We hired a TEI specialist several weeks ago to begin the process of linking coins, coin types, hoards, people, places, and other subject matter in the body of these books to URIs in our or other information systems.<br />
<br />
So far three of these books are complete: <br />
<ol style="text-align: left;">
<li><i><a href="http://numismatics.org/digitallibrary/ark:/53695/nnan24928">The Fifth Dura Hoard</a></i></li>
<li><i><a href="http://numismatics.org/digitallibrary/ark:/53695/nnan20591">The Earliest Coins of Norway</a></i></li>
<li><i><a href="http://numismatics.org/digitallibrary/ark:/53695/nnan19167">The Medallic Work of A.A. Weinman</a></i></li>
</ol>
Like the first book published into our Digital Library (Noe's Coin Hoards), the TEI links have been transformed into RDF conforming to Open Annotation, and these annotations are available in our other systems. For example, <a href="http://numismatics.org/authority/saltus">J. Sanford Saltus</a> is referenced in <i>The Medallic Work of A.A. Weinman</i>, and so this annotation is available in the biography of Saltus in our EAC-CPF-driven authority system.<br />
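<br />
The transformation from TEI links to Open Annotation can be illustrated with a short sketch. This is not the stylesheet used in our Digital Library, which is written in XSLT; it is a minimal Python approximation that assumes each section is a TEI div with an xml:id, that links are ref elements with a target attribute, and that a section URI is formed by appending the div's id to the book URI as a fragment (a hypothetical pattern chosen for illustration).<br />

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OA = "http://www.w3.org/ns/oa#"


def annotations_from_tei(tei_xml, book_uri):
    """Return Open Annotation triples for each ref with a target in a TEI string."""
    triples = []

    def walk(element, section_uri):
        for child in element:
            if child.tag == TEI + "div" and child.get(XML_ID):
                # Descend with a new section URI so that refs attach to the
                # lowest-level div that contains them.
                # Assumption: section URI = book URI + "#" + xml:id.
                walk(child, book_uri + "#" + child.get(XML_ID))
                continue
            if child.tag == TEI + "ref" and child.get("target") and section_uri:
                # Each annotation is a blank node with three triples.
                anno = "_:anno%d" % (len(triples) // 3)
                triples.append((anno, RDF_TYPE, OA + "Annotation"))
                # The body is the linked entity; the target is the book section.
                triples.append((anno, OA + "hasBody", child.get("target")))
                triples.append((anno, OA + "hasTarget", section_uri))
            walk(child, section_uri)

    walk(ET.fromstring(tei_xml), None)
    return triples
```

Each annotation pairs the linked URI (the body) with the section in which it is mentioned (the target), which is what allows an authority record to list the sections of a book that refer to it.<br />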
<br />
Most of the remaining books should have completed value-added TEI markup by the end of the year. </div>
<h2 style="text-align: left;">
First EBook published as part of Mellon/NEH Humanities Open Book Project</h2>
Posted by Ethan Gruber, 2016-03-17<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
This is a follow-up to some major feature additions in <a href="http://numismatics.org/search">MANTIS</a> and <a href="http://coinhoards.org/">IGCH</a> detailed on the <a href="http://numishare.blogspot.com/2016/03/updating-mantis-and-igch-incorporating.html">Numishare blog</a>.<br />
<br />
Today, we have published our first out of print, open access EBook for the <span id="goog_34571069"></span><a href="http://www.neh.gov/grants/odh/humanities-open-book-program">NEH/Mellon Foundation Humanities Open Book Program</a>. It is Sydney Noe's 1920 <i><a href="http://numismatics.org/digitallibrary/ark:/53695/nnan146115">Coin Hoards</a>, </i>the first issue of Numismatic Notes and Monographs. As we discussed in our grant application, we had a vendor transcribe these PDFs of images we received from <a href="https://www.hathitrust.org/">HathiTrust</a> into TEI. The TEI is run through a normalization XSLT stylesheet to correct some issues and pull bibliographic metadata from various sources, and then value-added tagging is applied to link to coins in our collection, hoards on coinhoards.org, and entities in various geographic gazetteers or linked open data vocabulary systems.<br />
<span id="goog_34571070"></span><br />
<span id="goog_34571070">As a result, we not only have a digital text that you can view in your browser as HTML5 or download as an EPUB 3.0.1, but a richly-tagged document that is exposed as RDF conforming to Open Annotation, which is then published into our archival SPARQL endpoint (and soon published into <a href="http://pelagios.org/">Pelagios</a>).</span> Many of the technical features of this publication process have already been discussed in this blog or in the post linked above.<br />
<br />
This framework is part of a broader effort to integrate all of our Library, Archive, and Museum holdings into a central hub for numismatic research. It is therefore possible to gain further insight about the people, places, and things mentioned in these digital publications through Linked Open Data methodologies, but also to provide greater context to our data-driven numismatic research projects like IGCH, <a href="http://numismatics.org/ocre">OCRE</a>, etc.<br />
<br />
We now have rich links not only among our numismatic projects focusing on hoards, coins, and coin types, but also between these projects and numismatic monographs and journals, archival research notebooks, finding aids, and authority records. In addition to reading biographical information about <a href="http://numismatics.org/authority/noe">Sydney Noe</a> in Archer, you can view a map and timeline of his life, explore his <a href="http://eaditor.blogspot.com/2016/01/sparql-based-social-network-graph-in.html">social network graph</a>, and access a list of materials written by or about him.<br />
<br />
This is the topic of my CAA presentation in Oslo in a few weeks.<br />
<br /></div>
<h2 style="text-align: left;">
Toward a more thoroughly integrated numismatic research system</h2>
Posted by Ethan Gruber, 2016-03-11<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
I
am making updates to our systems in preparation for the initial
publication of <a href="http://www.neh.gov/grants/odh/humanities-open-book-program">NEH/Mellon</a> EBooks. Part of the project is to thoroughly
integrate these EBooks with our <a href="http://numismatics.org/search/">collection</a>, <a href="http://numismatics.org/archives/">archives</a>, <a href="http://coinhoards.org/">IGCH</a>, and related
project databases. I still have some work to do, but should have the first EBooks ready next week.</div>
I
updated the RDF model for our digitized Newell notebooks to conform to
the model for our EBooks (Open Annotation) (there is one book published so far, the <a href="http://numismatics.org/digitallibrary/ark:/53695/Miller-ANS-Medals">ANS Medals book by Miller</a>). What this means is that mentions of IGCH, other
scholars represented in our biographies site, and [soon] individual
coins in Newell's notebooks will be made available through those other
interfaces.</div>
<br />
See <a href="http://coinhoards.org/id/igch1399">http://coinhoards.org/id/igch1399 </a></div>
<br />
<ul style="text-align: left;">
<li>You can click on individual pages where Newell notes IGCH1399, and the page will load in Archer.</li>
<li>You
can see a list of coin types from this hoard, and you can download the
list of coin types or a full list of coins from the hoard (note that we
aren't publishing our Greek coins that aren't connected to coin type
URIs in <a href="http://nomisma.org/sparql">nomisma.org's SPARQL endpoint</a>).</li>
</ul>
</div>
</div>
<br />
On <a href="http://numismatics.org/authority/id/newell">http://numismatics.org/authority/id/newell</a> (an EAC-CPF authority record)</div>
<br />
These already functioned --</div>
<ul style="text-align: left;">
<li>See a list of archival materials about Edward Newell</li>
<li>(Fairly
new) Several annotations in Miller's Medallic Arts of the ANS where he
mentions Newell. You can click a link to go directly to a section.</li>
<li>A social network graph showing Newell and his relations (also driven by SPARQL, detailed <a href="http://eaditor.blogspot.com/2016/01/sparql-based-social-network-graph-in.html">here</a>).</li>
</ul>
</div>
</div>
</div>
<br />
On <a href="http://numismatics.org/authority/id/noe">http://numismatics.org/authority/id/noe</a></div>
<ul style="text-align: left;">
<li>As before, you can get a list of archival materials about Noe</li>
<li>Newell mentions Noe on two pages of a notebook</li>
</ul>
</div>
</div>
<br />
Next steps:</div>
<ol style="text-align: left;">
<li>Update the code for <a href="http://numismatics.org/search/">Mantis</a> to display annotations about specific coins referenced in Newell's notebooks or our EBooks.</li>
<li>Update the Pelagios exports for the Digital Library and Archer to make our EBooks and archival materials more broadly accessible to the ancient world community</li>
<li>Build widgets into our <a href="http://numismatics.org/digitallibrary/">Digital Library</a> to pull data from our other systems</li>
</ol>
</div>
<div>
<div>
<br />
This
interlinking will be inherent to the publication mechanism for our
EBooks. When we publish the first several next week, the annotations
will be available in Mantis, the Archer Biographies, IGCH, etc.</div>
<div>
<br /></div>
<div>
I will be discussing these things and more in my presentation at <a href="http://caaconference.org/">CAA</a> in Oslo at the end of the month.</div>
</div>
</div>
<h2 style="text-align: left;">
SPARQL-based social network graph in xEAC</h2>
Posted by Ethan Gruber, 2016-01-28<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
I pushed into production a new SPARQL-based social network graph feature in <a href="https://github.com/ewg118/xEAC">xEAC</a>. The most interesting places to start are <a href="http://numismatics.org/authority/newell">http://numismatics.org/authority/newell</a> and <a href="http://numismatics.org/authority/new_york_numismatic_club">http://numismatics.org/authority/new_york_numismatic_club</a>, but we have a lot of work to do to enhance the linkage between our authorities in order to make these visualizations more useful in the future.<br />
<br />
Nearly a year ago, I began implementing a new <a href="http://www2.archivists.org/groups/technical-subcommittee-on-eac-cpf/encoded-archival-context-corporate-bodies-persons-and-families-eac-cpf">EAC-CPF</a> to RDF data model that could represent a graph of relationships in order to begin experimenting with rendering a social network graph in real time. After investigating the open source Javascript graph visualization tools, I chose <a href="http://visjs.org/">vis.js</a>, as it was powerful, easy to use, and could load JSON on the fly. I got a very basic graph working a year ago in time for <a href="http://scholarslab.org/digital-humanities/moving-peoplelinking-lives-dh-symposium/">Moving People, Linking Lives</a> at the University of Virginia, but it wasn't interactive, in that you could not expand beyond the first level of nodes connected to the authority record you were immediately viewing.<br />
<br />
After launching our first <a href="http://eaditor.blogspot.com/2016/01/first-ebook-published-to-ans-digital.html">EBook</a> a few weeks ago in <a href="https://github.com/AmericanNumismaticSociety/etdpub">ETDPub</a> (which is integrated with <a href="http://numismatics.org/authorities/">our production installation</a> of xEAC), I decided to revisit xEAC development of the social network graph interface.<br />
<br />
<h3 style="text-align: left;">
The Model</h3>
The RDF model implements bits and pieces of various standard ontologies. People, corporate bodies, and families have separate URIs for their entity represented as a Concept and as a Thing. The Concept (skos:Concept) of a person can be linked to concepts of that person in other vocabulary systems, like the Getty ULAN, VIAF, Wikidata, or <a href="http://socialarchive.iath.virginia.edu/">SNAC</a>. This is also the data object where you may include provenance about the creation and modification of the object record. For example, dcterms:created applied to a foaf:Person would imply that the person was born on the given date, but when used in a skos:Concept, it implies that the concept data object was created in the data system on the given date.<br />
<br />
The Concept object is connected to the Thing object with foaf:focus.<br />
<br />
The Thing object contains mainly biographical information, using the <a href="http://vocab.org/bio/">bio ontology</a>. While much work remains to be done to link individuals to events, basic birth and death dates are represented, as well as a string of bio:relationships. Each bio:Relationship object contains a property defining the nature of the relationship and the target entity of the relationship. I will probably revisit the properties by which people are linked to organizations (using the <a href="https://www.w3.org/TR/vocab-org/">org ontology</a> more properly), but the model does function well enough to generate a graph of relationships.<br />
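<br />
To make the shape of this model concrete, here is a minimal Python sketch that builds the triples for one person as plain tuples. It is an illustration only, not the xEAC serialization itself: the "#entity" suffix for the Thing URI and the XEAC namespace below are placeholders assumed for the example, while the skos, foaf, and bio terms are the ones named above and in the SPARQL query that follows.<br />

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
FOAF = "http://xmlns.com/foaf/0.1/"
BIO = "http://purl.org/vocab/bio/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XEAC = "http://example.org/xeac#"  # placeholder, not the real xEAC namespace


def person_triples(concept_uri, name, matches, relationships):
    """Build Concept and Thing triples for a person.

    matches: URIs of the same person in other vocabulary systems.
    relationships: (relationship type URI, target URI) pairs.
    """
    thing_uri = concept_uri + "#entity"  # hypothetical URI pattern for the Thing
    triples = [
        # The Concept carries labels and links to other vocabularies.
        (concept_uri, RDF_TYPE, SKOS + "Concept"),
        (concept_uri, SKOS + "prefLabel", name),
        # foaf:focus connects the Concept to the Thing.
        (concept_uri, FOAF + "focus", thing_uri),
        # The Thing carries the biographical information.
        (thing_uri, RDF_TYPE, FOAF + "Person"),
        (thing_uri, FOAF + "name", name),
    ]
    for match in matches:
        triples.append((concept_uri, SKOS + "exactMatch", match))
    for index, (rel_type, target) in enumerate(relationships):
        rel = "_:rel%d" % index  # each relationship is its own object
        triples.append((thing_uri, BIO + "relationship", rel))
        triples.append((rel, RDF_TYPE, BIO + "Relationship"))
        triples.append((rel, XEAC + "relationshipType", rel_type))
        triples.append((rel, BIO + "participant", target))
    return triples
```

Keeping the relationship as a separate object, rather than a direct property between two people, is what leaves room for a relationship type and a target on each one.<br />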
<br />
<h3 style="text-align: left;">
SPARQL to Vis.js JSON</h3>
Vis.js renders two JSON models, one for nodes and the other for edges, into a visual graph following HTML5 standards. Essentially, I had to build two web services in xEAC that would deliver these JSON models that could be read in real time via Ajax. The underlying model for these services is the <a href="https://www.w3.org/TR/sparql11-query/">SPARQL</a> query, and the views are generated with two different XSLT stylesheets to generate the JSON that vis.js requires to render the graph. The query is this:<br />
<br />
<br />
<pre>SELECT ?sourceName ?type ?target ?name ?class WHERE {
  &lt;URI&gt; foaf:name ?sourceName ;
        bio:relationship ?rel .
  ?rel xeac:relationshipType ?type ;
       bio:participant ?target .
  ?target foaf:name ?name ;
          a ?class
}</pre>
<br />
Essentially, we get all of the relationships connected to a particular entity (URI), the type of relationship (e.g., rel:spouseOf), and the target entity, whether another URI in the system or a blank node. The SPARQL response is processed and serialized into JSON. When clicking on connecting nodes in the graph visualization--if the target node is <i>not</i> a blank node RDF object (therefore, another authority in xEAC)--vis.js fires off another Ajax call to create new nodes and edges. Arrows in the graph visualization indicate the directionality of the relationship.<br />
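<br />
In xEAC this serialization is done with XSLT, but the logic is easier to show in a few lines of Python. The sketch below assumes the endpoint returns the standard SPARQL 1.1 JSON results format and produces the nodes and edges arrays that vis.js reads. The "expandable" flag is a hypothetical property added for this example to mark targets that are URIs rather than blank nodes; it is not part of vis.js.<br />

```python
def sparql_to_visjs(source_uri, results):
    """Convert SPARQL JSON results for the query above into vis.js nodes and edges."""
    nodes = {}
    edges = []
    seen_edges = set()
    for row in results["results"]["bindings"]:
        # The source node is the authority record being viewed.
        nodes.setdefault(
            source_uri,
            {"id": source_uri, "label": row["sourceName"]["value"], "expandable": True},
        )
        target = row["target"]
        target_id = target["value"]
        nodes.setdefault(
            target_id,
            {
                "id": target_id,
                "label": row["name"]["value"],
                # Only URI targets are other authorities that can be expanded;
                # blank nodes ("bnode") have no record to query.
                "expandable": target["type"] == "uri",
            },
        )
        # A target with several classes yields several rows; keep one edge.
        key = (source_uri, target_id, row["type"]["value"])
        if key not in seen_edges:
            seen_edges.add(key)
            edges.append(
                {
                    "from": source_uri,
                    "to": target_id,
                    "label": row["type"]["value"],
                    "arrows": "to",  # shows the direction of the relationship
                }
            )
    return {"nodes": list(nodes.values()), "edges": edges}
```

When a user clicks an expandable node, the same function can be applied to the results of a second query for that node's URI, and the new nodes and edges merged into the existing graph.<br />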
<br />
I should say that this is just the first phase of social network graph visualization in xEAC. While I have focused mainly on visualizing relationships on the level of the individual authority, my goal is to expand the application to implement a more sophisticated query interface that allows users to select arbitrary parameters to generate their own visualizations. For example, a user may want to view all persons grouped together by family or corporate body, to group people by occupation, or to filter by date or place. All of these things are possible by reconceptualizing EAC-CPF into RDF graphs and developing the SPARQL queries that can be rendered into JSON for vis.js.</div>
<h2 style="text-align: left;">
Survey to help usability testing</h2>
Posted by Ethan Gruber, 2016-01-13<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
I have created a short questionnaire in a Google form to aid in usability testing for our TEI -> EPUB serialization. It is available at <a href="https://docs.google.com/forms/d/10Prvpm5eDvjNZaeqgXZ7luLeSkVrOgZ3hJX5zjFBuSg/viewform">https://docs.google.com/forms/d/10Prvpm5eDvjNZaeqgXZ7luLeSkVrOgZ3hJX5zjFBuSg/viewform</a><br />
<br />
You can download the EPUB file for the Miller <a href="http://numismatics.org/digitallibrary/id/Miller-ANS-Medals"><i>Medallic Arts of the American Numismatic Society</i></a> book <a href="http://numismatics.org/digitallibrary/id/Miller-ANS-Medals.epub">here</a>.</div>
<h2 style="text-align: left;">
First EBook published to ANS Digital Library</h2>
Posted by Ethan Gruber, 2016-01-13<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
This afternoon, we have published our first EBook to the ANS Digital Library in the <a href="https://github.com/AmericanNumismaticSociety/etdpub">ETDPub</a> framework. This EBook, <a href="http://numismatics.org/digitallibrary/id/Miller-ANS-Medals">Medallic Art of the American Numismatic Society, 1865–2014</a> by Scott Miller, is encoded in TEI and has been issued with a Creative Commons BY-NC license. While the TEI file has not been fully linked into name and place authority files, I was able to use regex to link to medals in the American Numismatic Society collection and link one prominent scholar, <a href="http://numismatics.org/authority/newell">Edward T. Newell</a>, to our archival authority record. The TEI file will be fully linked up later, but the publication of this EBook can be seen as a completely functional demonstration of the technical application of linked open data principles to publishing these types of books for the larger <a href="http://www.neh.gov/grants/odh/humanities-open-book-program">NEH-Mellon Humanities Open Book</a> project.<br />
<br />
The TEI is indexed into Solr by ETDPub for full-text search, but this is only the beginning of this system's features. Using <a href="http://wiki.orbeon.com/forms/doc/developer-guide/xml-pipeline-language-xpl">Orbeon's XPL pipelines</a>, we are able to cobble together a series of XSLT transformations of the TEI file into XHTML and other XML files (NCX, OPF) required by the EPUB 3.0.1 specification. There is a link to the EPUB download on the page for the EBook, and the EPUB file is generated dynamically. I have tested on an ereader application on my desktop (Ubuntu) and a few on my Android phone. They mostly seem to work well, although the table of contents isn't consistently functional; this seems to be more an issue of the individual app not supporting EPUB 3.0.1 correctly than of the EPUB file itself. I plan to put together a survey to assist in usability testing.<br />
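<br />
For readers unfamiliar with the EPUB container, the packaging step can be sketched in Python. This is not how ETDPub does it (that happens in an XPL pipeline); it is a minimal illustration that assumes the OPF, NCX, and XHTML files have already been generated as strings, and the OEBPS/content.opf path is a conventional choice made for the example. The one firm requirement shown is that the mimetype file comes first in the archive and is stored uncompressed.<br />

```python
import io
import zipfile

# container.xml tells the reading system where to find the OPF package file.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub(files):
    """Package already-generated files (path -> string) into EPUB bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as epub:
        # The mimetype entry must be the first file and must not be compressed.
        epub.writestr("mimetype", "application/epub+zip",
                      compress_type=zipfile.ZIP_STORED)
        epub.writestr("META-INF/container.xml", CONTAINER_XML,
                      compress_type=zipfile.ZIP_DEFLATED)
        # Everything else (OPF, NCX, XHTML chapters) can be compressed.
        for path, content in files.items():
            epub.writestr(path, content, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()
```

Because the archive is built in memory, the same function could serve a download generated on request rather than a file stored on disk.<br />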
<br />
It is also important to note that the focus with EPUB serialization so far has been almost solely on functionality. The XSLT stylesheets are very basic and I have applied almost no CSS styling, but there is potential in enhancing the overall aesthetic of the document. That will come later as functional issues are ironed out. I am aware the tables do not seem to render properly.<br />
<br />
The other major feature of ETDPub's TEI publishing is serialization into RDF (so far only RDF/XML, but JSON-LD and Turtle outputs are coming). This RDF is fairly rudimentary so far. The RDF contains a data object for the book as a whole (and associated metadata in dcterms, like creator and publisher) and for each child div, using dcterms:isPartOf to link the hierarchical structure of the book together. Furthermore, any link (ref element) within the lowest level relevant div and any name that has been linked via the @corresp attribute to an authoritative URI in the teiHeader will be rendered as an annotation following the <a href="http://www.openannotation.org/spec/core/">Open Annotation</a> model. ETDPub is capable of executing CRUD operations with a SPARQL 1.1-compliant endpoint, and so the Digital Library application is posting into the triplestore that links our archival objects and authority records together. I have <a href="http://eaditor.blogspot.com/2014/06/ans-archives-v2-has-gone-live-ead-eac.html">previously discussed</a> linking <a href="https://github.com/ewg118/xEAC">xEAC</a> and <a href="https://github.com/ewg118/eaditor">EADitor</a> together via SPARQL, and now ETDPub is capable of doing the same. The authority record for Newell now includes links to the sections in the ANS medals book in which he was mentioned, in addition to the research notebooks and photographs associated with Newell that are contained in the ANS archives. We are moving forward with linking our library, archives, and collection more closely together internally, as well as paving the way for scholars to gain further context with data sources outside the ANS.</div>
<h2 style="text-align: left;">
ANS Awarded Funding for NEH/Mellon Foundation’s Humanities Open Book Project</h2>
Posted by Ethan Gruber, 2015-12-17<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
The American Numismatic Society has been chosen as one of ten publishers
to participate in the <a href="https://mellon.org/news-publications/articles/humanities-open-book/">Humanities Open Book project</a>, a joint <a href="http://neh.gov/">NEH</a>-<a href="https://mellon.org/">Mellon Foundation</a> grant program to convert out-of-print books of enduring
scholarship into EPUB e-books licensed to allow readers to search and
download these books freely, and to read them on any type of e-reader.
The ANS is the only learned society to receive funding for this
initiative.<br />
<br />
“The large number of valuable scholarly books in the
humanities that have fallen out of print in recent decades represents a
huge untapped resource,” said NEH Chairman William Adams. “By placing
these works into the hands of the public we hope that the Humanities
Open Book program will widen access to the important ideas and
information they contain and inspire readers, teachers and students to
use these books in exciting new ways.”<br />
<br />
ANS publications date back
to 1866 and include over 500 volumes of numismatic scholarship. Thanks
to the funding received from the Mellon Foundation, nearly 100 of its
rarest out-of-print books will be converted into free EPUB digital
editions. The ANS will go one step further by <a href="http://www.tei-c.org/index.xml">TEI</a>-encoding these
editions for online viewing, searching, and linking. Following
best-practices of <a href="http://linkeddata.org/">Linked Open Data</a> (LOD), these XML files will link to
(and will be able to be linked from) other Open Access (OA) resources in
the Humanities, benefiting researchers in history, archaeology, art
history, geography, and other disciplines.<br />
<br />
“Scholars in the
humanities are making increasing use of digital media to access
evidence, produce new scholarship, and reach audiences that increasingly
rely on such media for information to understand and interpret the
world in which they live,” said Earl Lewis, President of the Andrew W.
Mellon Foundation.<br />
<br />
“Knowledge wants to be free,” Andrew
Reinhard, ANS Director of Publications said. “This grant will help the
ANS put even more of its collections online for free and open access for
anyone who wants it.” The ANS continues its ongoing, longtime
commitment to digitization and databases having placed over 600,000
objects online—more than 100,000 of which have been photographed—while
contributing tens of thousands of coin records via international
projects such as Online Coins of the Roman Empire (OCRE) and PELLA:
Coinage of the Macedonian kings of the Argead dynasty. Thanks to the
Mellon grant, the ANS can continue to add its publications to this suite
of OA materials.<br />
<br />
Ethan Gruber, the ANS’s Director of Data
Science, said “this is an important project that will enable us to
further integrate our numismatic collection, archival materials, and
digital library into a cohesive platform to further not only the study
of coins, but also the study of the evolution of numismatics.”<br />
<br />
“On
behalf of the Trustees and staff of the Society, I would like to thank
the Andrew W. Mellon Foundation for their generous support of this
exciting project,” Ute Wartenberg Kagan, Executive Director of the ANS,
said.<br />
<br />
The Mellon-funded EPUB and TEI-encoded publications will be available by the end of 2016.<br />
<br />
For more information, contact Andrew Reinhard, Director of Publications, at areinhard@numismatics.org.<br />
<br />
The full list of works to be made publicly accessible as EBooks through this program is available at <a href="https://drive.google.com/file/d/0B0qn_O39OBdmZXZiVjdJZ2pDQjg/view?usp=sharing">https://drive.google.com/file/d/0B0qn_O39OBdmZXZiVjdJZ2pDQjg/view?usp=sharing </a></div>
<h2 style="text-align: left;">
The ANS Digital Library, a Look Under the Hood</h2>
Posted by Ethan Gruber, 2015-12-04<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
The ANS <a href="http://numishare.blogspot.com/2015/09/the-american-numismatic-society.html">announced</a> the launch of its <a href="http://numismatics.org/digitallibrary">Digital Library</a> a few months ago. There are only a few items in the repository at the moment, but we will be expanding in the very near future to include journal articles and open access EBooks. This blog post will introduce some of the technical concepts behind the open source DL framework, <a href="https://github.com/AmericanNumismaticSociety/etdpub">ETDPub</a>.<br />
<br />
The idea that initially drove our framework was the desire to make numismatic theses and dissertations more widely and freely accessible. Andrew Reinhard, ANS Director of Publications, came to me in the late summer to put together something very basic that we could launch at the <a href="http://www.xvcin.unime.it/">INC in Taormina</a> in late September. At first, I looked into an off the shelf tool called <a href="https://www.tdl.org/etds/">Vireo</a>, developed by the Texas Digital Library. However, this platform was designed for the phases of dissertation review and publication into an institutional repository at a university. It is backend-only, with no front end to speak of. The only solution was to build something effective quickly. The basic specifications for ETD publication were: an interface for basic metadata entry, an upload mechanism for PDFs or other documents, and a front end to provide the public with access to the documents.<br />
<br />
Since I've done a lot of <a href="http://www.w3.org/TR/2012/WD-xforms20-20120807/">XForms</a> development upon library metadata standards in the past, and since nearly all of our applications are already built on <a href="https://en.wikipedia.org/wiki/XRX_%28web_application_architecture%29">XRX/SPARQL</a> design concepts in <a href="http://www.orbeon.com/">Orbeon</a>, we opted to use Orbeon for this framework as well. We put together a basic <a href="http://www.loc.gov/standards/mods/">MODS</a> template for electronic theses and dissertations and an XForms editor to handle data entry, document upload, and web service interaction. Like <a href="https://github.com/ewg118/eaditor">EADitor</a>, <a href="https://github.com/ewg118/xEAC">xEAC</a>, <a href="https://github.com/ewg118/numishare">Numishare</a>, etc. there are lookup mechanisms for the <a href="http://vocab.getty.edu/">Getty LOD thesauri</a>, <a href="http://www.geonames.org/">Geonames</a>, <a href="http://viaf.org/">VIAF</a>, <a href="http://nomisma.org/">Nomisma.org</a>, <a href="http://pleiades.stoa.org/">Pleiades</a> for ancient geography, and LCSH from the <a href="http://id.loc.gov/">Library of Congress</a>. It even includes lookups for authority records from a xEAC installation (like EADitor). We went from development to production in the first version of the framework in about a week.<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheOscRxihjZfOagAXq0AfN6kPu0hAIAxUXalntWZPGvKIZerJQ8l_OMcSuX3P1hUE-ZmYqIvZAMxB4Y2zMF8BMSuO4Y6BmqXKSaz5bDLmeQuKqr9-cuup1DRsUVZVDIVFikjRg0BvOsTE/s1600/Screenshot+from+2015-12-04+11%253A11%253A49.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheOscRxihjZfOagAXq0AfN6kPu0hAIAxUXalntWZPGvKIZerJQ8l_OMcSuX3P1hUE-ZmYqIvZAMxB4Y2zMF8BMSuO4Y6BmqXKSaz5bDLmeQuKqr9-cuup1DRsUVZVDIVFikjRg0BvOsTE/s400/Screenshot+from+2015-12-04+11%253A11%253A49.png" width="400" /></a></div>
<br />
Saving the MODS file writes it to an eXist XML database, publishes the metadata to Solr, and indexes the document file into Solr for full-text searching using the <a href="https://wiki.apache.org/solr/ExtractingRequestHandler">ExtractingRequestHandler</a>. Yesterday, I extended the publication functionality to serialize MODS into RDF to post triples in a SPARQL endpoint. This draws content from our Digital Library into our archival platforms built on EADitor and xEAC. We are digitizing auction catalogs, books, and journals edited or authored by prominent numismatic scholars that also played a role in the Society, and therefore have <a href="http://eac.staatsbibliothek-berlin.de/">EAC-CPF</a> records in the <a href="http://numismatics.org/authorities/">ANS Biographies</a> service. For example, our Digital Library contains one auction catalog edited by <a href="http://numismatics.org/authority/adams_edgar">Edgar H. Adams</a>. The metadata from this catalog are published to the SPARQL endpoint, and two items from our archive (an <a href="https://www.loc.gov/ead/">EAD</a> finding aid and a photograph described in MODS) are also available from the biographical page in the Adams authority record. This is the ideal model for larger-scale aggregation of cultural heritage content associated with archival authorities. It is nearly impossible to maintain these connections by hard-coding resourceRelation elements in the EAC-CPF record.<br />
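<br />
As a rough illustration of the full-text indexing step, the request to Solr's ExtractingRequestHandler is an HTTP POST of the document file to the core's /update/extract path, with metadata passed as literal.* parameters. The Python sketch below only builds that URL. The core URL and the use of an "id" field are assumptions for the example, not ETDPub's actual configuration, where this call is made from the XForms application.<br />

```python
from urllib.parse import urlencode


def build_extract_url(solr_core_url, doc_id, commit=True):
    """Build the URL for posting a document to Solr's /update/extract handler."""
    # literal.<field> sets a field value on the indexed document directly,
    # alongside the text that Solr extracts from the uploaded file.
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return solr_core_url.rstrip("/") + "/update/extract?" + urlencode(params)
```

The PDF itself would then be sent as the body of a POST request to the returned URL.<br />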
<br />
So now we have three standalone software frameworks that comprise our digital library and archive, all connected together via linked open data methodologies. The next step is to begin integrating coins from our collection into this broader network of numismatic information.<br />
<br />
We will begin this work soon with the digitization of ANS monographs. These books contain references to coins in our collection, to hoards published on coinhoards.org, to materials in our archive, and to numismatic concepts defined on nomisma.org.<br />
<br />
ETDPub already supports the publication of TEI and dynamic serialization of TEI into EPUB 3.0.1.<br />
<br />
More details soon.</div>
<h2 style="text-align: left;">
xEAC at DCMI 2014</h2>
Posted by Ethan Gruber, 2014-10-07<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
I am heading to Austin, Texas this week to discuss <a href="https://github.com/ewg118/xEAC">xEAC</a>, with an illustration of linked open data principles applied to archival authorities and collections. This presentation is part of a full day pre-conference workshop at DCMI 2014 detailing the latest advances in digital archives entitled "<a href="http://dcevents.dublincore.org/IntConf/index/pages/view/2014-archives">Fonds & Bonds: Archival Metadata, Tools, and Identity Management</a>." Below is my presentation:<br />
<br />
<div style="text-align: center;">
<br />
<br />
<iframe allowfullscreen="" frameborder="0" height="356" marginheight="0" marginwidth="0" scrolling="no" src="//www.slideshare.net/slideshow/embed_code/39968363" style="border-width: 1px; border: 1px solid #CCC; margin-bottom: 5px; max-width: 100%;" width="427"> </iframe> <br />
<div style="margin-bottom: 5px;">
<a href="https://www.slideshare.net/ewg118/xeac-xforms-for-eaccpf" target="_blank" title="xEAC: XForms for EAC-CPF">xEAC: XForms for EAC-CPF</a>
</div>
</div>
</div>
<h2 style="text-align: left;">
Semantic Web Updates to xEAC</h2>
Posted by Ethan Gruber, 2014-10-03<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
After having implemented better semantic web standards in other projects I'm working on that use Orbeon for the front-end, I have applied these changes to xEAC. At present, xEAC supports export of RDF/XML in three different models: A default archival-based model, CIDOC-CRM, and one that conforms to the SNAP ontology. All three are a proof of concept and incomplete.<br />
<br />
xEAC now supports the delivery of the xEAC default model in Turtle and JSON-LD, through both REST and content negotiation. URIs for record pages now accept the following content types through the Accept header: text/html, application/xml (EAC-CPF), application/rdf+xml (default model), application/json (JSON-LD), text/turtle, application/tei+xml, and application/vnd.google-earth.kml+xml (KML). Requesting an unsupported content type results in an HTTP 406 Not Acceptable error.<br />
<br />
For example:<br />
<br />
<code>curl -H "Accept: application/json" http://numismatics.org/authority/elder</code><br />
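<br />
The negotiation logic can be summarized in a short Python sketch. It is an approximation for illustration, not the xEAC pipeline itself: the list of content types comes from the paragraph above, while the handling of q-values and the decision to treat */* as a request for HTML are assumptions.<br />

```python
# Content types accepted by record pages, as listed above.
SERIALIZATIONS = {
    "text/html": "HTML record page",
    "application/xml": "EAC-CPF",
    "application/rdf+xml": "RDF/XML (default model)",
    "application/json": "JSON-LD",
    "text/turtle": "Turtle",
    "application/tei+xml": "TEI",
    "application/vnd.google-earth.kml+xml": "KML",
}


def negotiate(accept_header):
    """Return (HTTP status, chosen content type) for an Accept header value."""
    candidates = []
    for position, part in enumerate(accept_header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        quality = 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if media_type:
            # Sort by descending quality, then by order in the header.
            candidates.append((-quality, position, media_type))
    for negative_quality, _, media_type in sorted(candidates):
        if negative_quality == 0:
            continue  # q=0 means the client does not accept this type
        if media_type == "*/*":
            return 200, "text/html"  # assumption: wildcard falls back to HTML
        if media_type in SERIALIZATIONS:
            return 200, media_type
    # Nothing the client accepts is supported.
    return 406, None
```

With this sketch, the curl request above resolves to JSON-LD, and a request for an unlisted type such as image/png yields a 406.<br />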
<br />
Furthermore, content negotiation has been implemented in the browse page. While Solr-based Atom results have been available through their own REST interface, you can now get them by requesting application/atom+xml. You can also get raw Solr XML back from application/xml. This might be useful to developers. I might implement the Solr JSON response, if there is interest (this would require a little more work).</div>