An object to encapsulate properties and functionality related to a specific DBpedia item.
high-level type of resource; currently only supports person, place, and organization.
rdflib.Namespace for OWL (Web Ontology Language)
Client for interacting with DBpedia Spotlight via REST API.
http://wiki.dbpedia.org/spotlight/usersmanual?v=ssd
Call the DBpedia Spotlight annotate service.
All arguments other than text are optional; if default configurations were specified when the client was initialized, those will be used unless an overriding value is specified here.
Parameters: |
|
---|---|
Returns: | dict with information on identified resources |
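The override behaviour described above (per-call values win over defaults set at initialization) can be sketched as follows. This is a minimal illustration, not the client's actual code: the class name `SpotlightClientSketch`, the method `build_params`, and the choice of `confidence`, `support`, and `types` as the configurable options are assumptions made for the example.

```python
class SpotlightClientSketch:
    """Hypothetical stand-in for the client; shows only default handling."""

    def __init__(self, confidence=None, support=None, types=None):
        # Defaults configured once, when the client is initialized.
        self.default_confidence = confidence
        self.default_support = support
        self.default_types = types

    def build_params(self, text, confidence=None, support=None, types=None):
        """Return the request parameters for a single annotate call."""
        params = {'text': text}  # text is the only required argument
        candidates = {
            'confidence': (confidence, self.default_confidence),
            'support': (support, self.default_support),
            'types': (types, self.default_types),
        }
        for name, (override, default) in candidates.items():
            # A value passed to this call takes precedence over the default.
            value = override if override is not None else default
            if value is not None:
                params[name] = value
        return params


client = SpotlightClientSketch(confidence=0.4, support=20)
print(client.build_params('Emory University is in Atlanta.'))
print(client.build_params('Emory University is in Atlanta.', confidence=0.1))
```

Options left unset both at initialization and in the call are omitted from the request entirely, so the service falls back to its own defaults.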
Default base url for DBpedia Spotlight web service
number of API calls made
datetime.timedelta - total duration of all API calls
returns a cached property that is calculated by function f
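One common way to build such a cached property is to store the computed value on the instance the first time it is requested. The sketch below assumes that approach; the storage attribute name and the `Resource` example class are hypothetical and not taken from the library.

```python
def cached_property(f):
    """Return a property computed by f on first access and reused afterwards."""
    attr = '_cached_' + f.__name__  # hypothetical per-instance storage name

    def getter(self):
        if not hasattr(self, attr):
            # First access: run f once and remember the result.
            setattr(self, attr, f(self))
        return getattr(self, attr)

    return property(getter)


class Resource:
    """Hypothetical class with one expensive-to-compute attribute."""

    def __init__(self):
        self.lookups = 0

    @cached_property
    def label(self):
        self.lookups += 1  # count how often the function really runs
        return 'example label'


r = Resource()
r.label
r.label
print(r.lookups)
```

Because the value is stored per instance, each object computes it at most once, which suits values that require a remote lookup.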
Client for interacting with VIAF (Virtual International Authority File) API.
http://www.oclc.org/developer/documentation/virtual-international-authority-file-viaf/using-api
Query autosuggest API. Returns a list of results.
Search VIAF by local.corporateNames
Search VIAF by local.personalNames
Search VIAF by local.geographicNames
Query VIAF search interface. Returns a list of feed entries, as parsed by feedparser.
Parameters: | query – CQL query in viaf syntax (e.g., cql.any all "term") |
---|
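As an illustration of the query syntax, the helper below assembles a CQL string from an index name, a relation, and a search term. The function `viaf_cql` is hypothetical; only the index names (`local.personalNames`, `local.corporateNames`, `local.geographicNames`) and the `cql.any all "term"` form come from the text above.

```python
def viaf_cql(term, index='cql.any', relation='all'):
    """Build a CQL query string of the form: index relation "term"."""
    # Escape backslashes and double quotes so the term stays one quoted string.
    escaped = term.replace('\\', '\\\\').replace('"', '\\"')
    return '%s %s "%s"' % (index, relation, escaped)


print(viaf_cql('Heaney'))
print(viaf_cql('Heaney, Seamus', index='local.personalNames'))
```

The resulting string would be passed as the `query` parameter of the search method.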
Annotate XML based on DBpedia Spotlight annotation results.
Currently logs a message (at info or warning level) when a VIAF look-up fails, or when attributes are not inserted in order to avoid overwriting existing values.
When track changes is requested, processing instructions will be added around annotated names for review in OxygenXML 14.2+. In cases where a name was untagged, the text will be marked as a deletion and the tagged version of the name will be marked as an insertion with a comment containing the description of the DBpedia resource, to aid in identifying whether the correct resource has been added. If a recognized name was previously tagged, a comment will be added indicating what attributes were added, or would have been added if they did not conflict with attributes already present in the document.
When using the track changes option, it is recommended to also run enable_oxygen_track_changes() once on the document, so that Oxygen will automatically open the document with track changes turned on.
Annotate XML based on DBpedia Spotlight annotation results. Assumes that the DBpedia Spotlight annotate service was called on the normalized text from this node. Currently updates the node that is passed in; whitespace will be normalized in text nodes where name tags are inserted. For TEI, DBpedia URIs are inserted as ref attributes; since EAD does not support referencing URIs, VIAF ids will be used where possible (currently only supports lookup for personal names).
If recognized names are already tagged as names in the existing XML, no new name tag will be inserted; attributes will only be added if they are not present in the original node.
Parameters: |
|
---|---|
Returns: | total number of entities inserted into the XML |
GeoNames flag: if true, annotate will convert DBpedia place URIs to GeoNames URIs when possible
Get the attributes to be inserted, based on the current document mode and the type of DBpediaResource.
Parameters: | res – namedropper.spotlight.DBpediaResource |
---|---|
Returns: | dictionary of attribute names -> values |
Get the name of the tag to be inserted, based on the current document mode and the type of DBpedia resource.
Parameters: | res – namedropper.spotlight.DBpediaResource instance for the tag to be inserted |
---|---|
Returns: | string tag |
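The tag choice depends on two inputs, the document mode and the resource type, so it can be pictured as a two-level lookup. The sketch below is an assumption about how such a mapping might look, not the library's actual table: the element names are standard TEI and EAD name elements, while `TAG_NAMES`, `tag_name`, and the generic `name` fallback are hypothetical.

```python
# Hypothetical mapping: document mode -> resource type -> element name.
TAG_NAMES = {
    'tei': {
        'person': 'persName',
        'place': 'placeName',
        'organization': 'orgName',
    },
    'ead': {
        'person': 'persname',
        'place': 'geogname',
        'organization': 'corpname',
    },
}


def tag_name(mode, resource_type):
    """Return the element name for a resource type in the given mode."""
    # Assumed fallback: use the generic name element for unknown types.
    return TAG_NAMES[mode].get(resource_type, 'name')


print(tag_name('tei', 'person'))
print(tag_name('ead', 'place'))
```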
OxygenXML track changes flag: if true, annotations will be tagged with OxygenXML track changes processing instructions, to enable review within Oxygen Author mode
VIAF flag: if true, annotate will convert DBpedia person URIs to VIAF URIs when possible
Annotate XML based on DBpedia Spotlight annotation results. Assumes that the DBpedia Spotlight annotate service was called on the normalized text from this node. Currently updates the node that is passed in; whitespace will be normalized in text nodes where name tags are inserted. For TEI, DBpedia URIs are inserted as ref attributes; since EAD does not support referencing URIs, VIAF ids will be used where possible (currently only supports lookup for personal names).
If recognized names are already tagged as names in the existing XML, no new name tag will be inserted; attributes will only be added if they are not present in the original node.
Currently logs a message (at info or warning level) when a VIAF look-up fails, or when attributes are not inserted in order to avoid overwriting existing values.
When track changes is requested, processing instructions will be added around annotated names for review in OxygenXML 14.2+. In cases where a name was untagged, the text will be marked as a deletion and the tagged version of the name will be marked as an insertion with a comment containing the description of the DBpedia resource, to aid in identifying whether the correct resource has been added. If a recognized name was previously tagged, a comment will be added indicating what attributes were added, or would have been added if they did not conflict with attributes already present in the document.
When using the track changes option, it is recommended to also run enable_oxygen_track_changes() once on the document, so that Oxygen will automatically open the document with track changes turned on.
Parameters: |
|
---|---|
Returns: | total number of entities inserted into the XML |
Attempt to auto-detect input file type. Currently supported types are EAD XML, TEI XML, or text. Any document that cannot be loaded as XML is assumed to be text.
Returns: | “tei”, “ead”, “text”, or None if file type is not recognized |
---|
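The detection rule described above can be sketched with the standard library: try to parse the input as XML, treat a parse failure as plain text, and otherwise inspect the root element. The function name `detect_filetype` and the root element names it checks (`TEI` and `ead`) are assumptions for illustration; the actual implementation may test documents differently.

```python
import io
import xml.etree.ElementTree as ET


def detect_filetype(source):
    """Return 'tei', 'ead', 'text', or None for unrecognized XML.

    source may be a filename or a file-like object.
    """
    try:
        root = ET.parse(source).getroot()
    except ET.ParseError:
        # Anything that cannot be loaded as XML is assumed to be text.
        return 'text'
    # Drop a '{namespace}' prefix, if present, to get the local name.
    local_name = root.tag.rsplit('}', 1)[-1]
    if local_name == 'TEI':
        return 'tei'
    if local_name == 'ead':
        return 'ead'
    return None


print(detect_filetype(io.BytesIO(b'<TEI xmlns="http://www.tei-c.org/ns/1.0"/>')))
print(detect_filetype(io.BytesIO(b'Plain text, not XML.')))
```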
Add a processing instruction to a document with an OxygenXML option to enable the track changes mode.
Normalize whitespace in a string to match the logic of normalize-space() in XPath. Replaces all internal sequences of whitespace with a single space and conditionally removes leading and trailing whitespace.
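A minimal sketch of this normalization, assuming the condition for trimming is exposed as a boolean argument (the `strip` parameter and the function name are hypothetical). XPath treats only space, tab, carriage return, and line feed as whitespace, so the pattern is limited to those characters rather than using `\s`.

```python
import re

# The four characters XPath normalize-space() treats as whitespace.
_XML_WHITESPACE = re.compile(r'[ \t\r\n]+')


def normalize_whitespace(text, strip=True):
    """Collapse whitespace runs to one space; optionally trim both ends."""
    collapsed = _XML_WHITESPACE.sub(' ', text)
    # Trimming is optional so text adjacent to other nodes can keep
    # the single space that separates it from its neighbours.
    return collapsed.strip(' ') if strip else collapsed


print(repr(normalize_whitespace('  Seamus \n\t Heaney  ')))
print(repr(normalize_whitespace('  Seamus \n\t Heaney  ', strip=False)))
```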
Base class for namedropper command-line scripts.
Init method will initialize the argument parser, parse command-line arguments, check that file type is either specified or can be auto-detected, and execute run().
Parser is saved as parser, in case other script logic needs reference to it.
Initialize an argument parser with common arguments. Currently includes filename and input type.
Extend to add arguments.
Initialize an xmlobject based on user-specified arguments for filename and type. Returns an instance of the appropriate XmlObject, or displays an error message if the document could not be parsed as XML.
argparse.ArgumentParser instance to be initialized by init_parser() at class instantiation.
placeholder method - extend with script logic
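The pattern described for the base class (build a parser, parse arguments, then call run()) can be sketched with argparse as below. This is an illustration under assumptions, not the library's code: the class names, the `--input` option name, and the optional `argv` argument (added so the example runs without real command-line input) are hypothetical, and the file type check is omitted.

```python
import argparse


class ScriptBaseSketch:
    """Hypothetical base class following the init/parse/run pattern."""

    parser = None  # set by init_parser() at instantiation

    def __init__(self, argv=None):
        self.init_parser()
        # argv=None makes argparse read sys.argv, as a real script would.
        self.args = self.parser.parse_args(argv)
        self.run()

    def init_parser(self):
        """Create the parser with the common arguments; extend to add more."""
        self.parser = argparse.ArgumentParser()
        self.parser.add_argument('filename', help='file to process')
        self.parser.add_argument('--input', choices=['text', 'tei', 'ead'],
                                 help='input type, if not auto-detected')

    def run(self):
        """Placeholder; subclasses supply the script logic."""
        raise NotImplementedError


class EchoScript(ScriptBaseSketch):
    """Example subclass adding one argument and minimal run() logic."""

    def init_parser(self):
        super().init_parser()
        self.parser.add_argument('--verbose', action='store_true')

    def run(self):
        self.summary = (self.args.filename, self.args.input,
                        self.args.verbose)


script = EchoScript(['letters.xml', '--input', 'tei'])
print(script.summary)
```

Keeping the parser on the instance lets a subclass call `self.parser.error()` or add arguments without rebuilding the common ones.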