getSchema’s RDFa Lite extractor is a REST web service that extracts RDF [1] data from RDFa Lite [2] annotations and provides the semantic information as N-Triples [5], N3 [3] or JSON [4]. The service is powered by node.js [9] and uses the jsdom [10] library.
Use our test form to try the service: http://getschema.org/rdfaliteextractor/test
The service endpoint is http://getschema.org/rdfaliteextractor
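The following parameters are required:
url: the URL-encoded address of the page to extract data from.
out: the output format. Accepted values are rdf, n3 and json. Any other value is treated as invalid and the service will return an error.
If any of the parameters is missing, the service will return an error.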
The service allows only GET requests. Any other request type will return an error.
Test the service using this URL: http://getschema.org/rdfaliteextractor?url=http%3A%2F%2Fgetschema.org%2Frdfalite2rdf%2Fexamples%2Freview.html&out=rdf
Requests sent to the API endpoint must be HTTP GET requests, with all arguments sent as query parameters.
All arguments must be URL-encoded as per RFC 3986 [7].
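The request rules above can be sketched in a few lines of Python. This is only an illustration: the helper name build_request_url is hypothetical, and the parameter names url and out are taken from the test URL shown in this document. The sketch validates the output format locally and percent-encodes the arguments before appending them as query parameters.

```python
from urllib.parse import urlencode

# Endpoint and accepted output formats as stated in this document.
ENDPOINT = "http://getschema.org/rdfaliteextractor"
ALLOWED_OUT = ("rdf", "n3", "json")


def build_request_url(page_url: str, out: str) -> str:
    """Return the GET URL that asks the service to extract data from page_url.

    Hypothetical helper, not part of the service itself.
    """
    if not page_url:
        raise ValueError("the url parameter is required")
    if out not in ALLOWED_OUT:
        raise ValueError(f"out must be one of {ALLOWED_OUT}, got {out!r}")
    # urlencode percent-encodes reserved characters such as ':' and '/'.
    query = urlencode({"url": page_url, "out": out})
    return f"{ENDPOINT}?{query}"


print(build_request_url("http://getschema.org/rdfalite2rdf/examples/review.html", "rdf"))
```

The resulting string can be passed to any HTTP client that issues GET requests.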
Consider the following HTML example; the possible service responses are shown below.
(See http://getschema.org/rdfalite2rdf/examples/review.html)
<!DOCTYPE HTML>
<html>
<head>
  <title>Untitled</title>
</head>
<body>
  <div vocab="http://schema.org/" typeOf="Review">
    <span property="name">book review - Around the World in Eighty Days</span> -
    by <span property="author">
      <span typeOf="Organization">
        <a href="http://insidebooks.blogspot.com/2007/08/book-of-books-around-world.html" property="url">Inside Books</a>
      </span>
    </span>,
    <span property="datePublished" content="2007-08-26">Sunday, August 26, 2007</span>
    <div property="reviewRating">
      <span typeOf="Rating">
        <span property="ratingValue">5</span>/
        <span property="bestRating">5</span>
        stars
      </span>
    </div>
    <div property="itemReviewed">
      <span typeOf="Book">
        <a href="…" property="url">Around the World in Eighty Days</a>
      </span>
    </div>
    <div property="description">
      is a great book and comes from a writer with a solid canon of pacey
      adventure stories. But what you miss in the film and cartoon versions
      is the sheer scale of the effort of Jules Verne’s vision. He must have
      sat there with maps, guide books and numerous steamer and train
      timetables to be able not only to map the journey round the world but
      also factor in the numerous diversions that Phileas Fogg and company
      have to take.
    </div>
  </div>
</body>
</html>
The out parameter is set according to the desired output format.
An N-Triples [5] response is sent when the out parameter is set to rdf (out=rdf). The response headers use the Content-type text/plain.
<_:gs0> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Review>.
<_:gs0> <http://schema.org/name> "book review - Around the World in Eighty Days".
<_:gs1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization>.
<_:gs0> <http://schema.org/author> <_:gs1>.
<_:gs1> <http://schema.org/url> <http://insidebooks.blogspot.com/2007/08/book-of-books-around-world.html>.
<_:gs0> <http://schema.org/datePublished> "2007-08-26".
<_:gs2> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Rating>.
<_:gs0> <http://schema.org/reviewRating> <_:gs2>.
<_:gs2> <http://schema.org/ratingValue> "5".
<_:gs2> <http://schema.org/bestRating> "5".
<_:gs3> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Book>.
<_:gs0> <http://schema.org/itemReviewed> <_:gs3>.
<_:gs0> <http://schema.org/description> "is a great book and comes from a writer with a solid canon of pacey adventure stories. But what you miss in the film and cartoon versions is the sheer scale of the effort of Jules Verne's vision. He must have sat there with maps, guide books and numerous steamer and train timetables to be able not only to map the journey round the world but also factor in the numerous diversions that Phileas Fogg and company have to take." .
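A response in this form can be split into subject, predicate and object with a small line-based parser. The following is a minimal sketch, assuming one statement per line in exactly the shape shown above; it is not a complete N-Triples parser (no datatypes, language tags or comments), and the name parse_triples is hypothetical.

```python
import re

# One statement per line: <subject> <predicate> <object-or-"literal"> .
TRIPLE = re.compile(
    r'^\s*<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"((?:[^"\\]|\\.)*)")\s*\.\s*$'
)


def parse_triples(text):
    """Return a list of (subject, predicate, object) tuples.

    Simplified sketch: lines that do not match the pattern are skipped.
    """
    triples = []
    for line in text.splitlines():
        match = TRIPLE.match(line)
        if match is None:
            continue  # blank or unrecognised line
        subject, predicate, resource, literal = match.groups()
        # Exactly one of resource/literal is set, depending on the object form.
        obj = resource if resource is not None else literal
        triples.append((subject, predicate, obj))
    return triples


sample = """\
<_:gs0> <http://schema.org/datePublished> "2007-08-26".
<_:gs0> <http://schema.org/author> <_:gs1>.
"""
for triple in parse_triples(sample):
    print(triple)
```

For anything beyond quick inspection, a dedicated RDF library is likely a better choice than a regular expression.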
An N3 response is sent when the out parameter is set to n3 (out=n3). The response headers use the Content-type text/n3.
@Prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

<_:gs0> rdf:type <http://schema.org/Review>;
    <http://schema.org/name> "book review - Around the World in Eighty Days";
    <http://schema.org/author> <_:gs1>;
    <http://schema.org/datePublished> "2007-08-26";
    <http://schema.org/reviewRating> <_:gs2>;
    <http://schema.org/itemReviewed> <_:gs3>;
    <http://schema.org/description> "is a great book and comes from a writer with a solid canon of pacey adventure stories. But what you miss in the film and cartoon versions is the sheer scale of the effort of Jules Verne's vision. He must have sat there with maps, guide books and numerous steamer and train timetables to be able not only to map the journey round the world but also factor in the numerous diversions that Phileas Fogg and company have to take.".

<_:gs1> rdf:type <http://schema.org/Organization>;
    <http://schema.org/url> <http://insidebooks.blogspot.com/2007/08/book-of-books-around-world.html>.

<_:gs2> rdf:type <http://schema.org/Rating>;
    <http://schema.org/ratingValue> "5";
    <http://schema.org/bestRating> "5".

<_:gs3> rdf:type <http://schema.org/Book>;
    <http://schema.org/url> <…>.
A JSON response is sent when the out parameter is set to json (out=json). The response format follows Talis RDF-JSON [6]. It is well-formed JSON delivered using the Content-type application/json.
{
  "_:gs0" : {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" : [
      { "type" : "uri", "value" : "http://schema.org/Review" }
    ],
    "http://schema.org/name" : [
      { "type" : "literal", "value" : "book review - Around the World in Eighty Days" }
    ],
    "http://schema.org/author" : [
      { "type" : "bnode", "value" : "_:gs1" }
    ],
    "http://schema.org/datePublished" : [
      { "type" : "literal", "value" : "2007-08-26" }
    ],
    "http://schema.org/reviewRating" : [
      { "type" : "bnode", "value" : "_:gs2" }
    ],
    "http://schema.org/itemReviewed" : [
      { "type" : "bnode", "value" : "_:gs3" }
    ],
    "http://schema.org/description" : [
      { "type" : "literal", "value" : "is a great book and comes from a writer with a solid canon of pacey adventure stories. But what you miss in the film and cartoon versions is the sheer scale of the effort of Jules Verne's vision. He must have sat there with maps, guide books and numerous steamer and train timetables to be able not only to map the journey round the world but also factor in the numerous diversions that Phileas Fogg and company have to take." }
    ]
  },
  "_:gs1" : {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" : [
      { "type" : "uri", "value" : "http://schema.org/Organization" }
    ],
    "http://schema.org/url" : [
      { "type" : "uri", "value" : "http://insidebooks.blogspot.com/2007/08/book-of-books-around-world.html" }
    ]
  },
  "_:gs2" : {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" : [
      { "type" : "uri", "value" : "http://schema.org/Rating" }
    ],
    "http://schema.org/ratingValue" : [
      { "type" : "literal", "value" : "5" }
    ],
    "http://schema.org/bestRating" : [
      { "type" : "literal", "value" : "5" }
    ]
  },
  "_:gs3" : {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" : [
      { "type" : "uri", "value" : "http://schema.org/Book" }
    ],
    "http://schema.org/url" : [
      { "type" : "uri", "value" : "…" }
    ]
  }
}
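Because the RDF-JSON layout nests subjects, predicates and object lists, it can be flattened into plain triples with three loops. The sketch below assumes the subject → predicate → list-of-objects structure described by the Talis RDF-JSON format [6], with each object carrying a type and a value; the function name rdf_json_to_triples and the small sample document are illustrative only.

```python
import json


def rdf_json_to_triples(document):
    """Flatten an RDF-JSON document into (subject, predicate, type, value) tuples."""
    triples = []
    for subject, predicates in document.items():
        for predicate, objects in predicates.items():
            for obj in objects:
                # obj["type"] is "uri", "literal" or "bnode" in the format assumed here.
                triples.append((subject, predicate, obj["type"], obj["value"]))
    return triples


# Hypothetical excerpt, modelled on the rating node of the review example.
sample = json.loads("""
{
  "_:gs2": {
    "http://schema.org/ratingValue": [{"type": "literal", "value": "5"}],
    "http://schema.org/bestRating": [{"type": "literal", "value": "5"}]
  }
}
""")

for triple in rdf_json_to_triples(sample):
    print(triple)
```

Keeping the object type in each tuple makes it possible to tell blank-node references apart from literals that happen to look like identifiers.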
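All service errors are delivered in JSON format. Errors are returned when a required parameter is missing, when the out parameter has an invalid value, or when the request is not a GET request.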
Iframes are not loaded.
Scripts are loaded when the script element is annotated with a typeof attribute with the value http://schema.org/WebPageElement/Script.
There may be other limitations in the triple extraction, such as duplicate triples, since the service is still in beta.
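Since duplicates are mentioned as a possible limitation, a client may want to filter them after parsing. This is a minimal sketch, assuming the triples have already been turned into hashable tuples on the client side; drop_duplicate_triples is a hypothetical helper, not something the service provides.

```python
def drop_duplicate_triples(triples):
    """Return the triples with exact duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for triple in triples:
        if triple not in seen:
            seen.add(triple)
            unique.append(triple)
    return unique


triples = [
    ("_:gs2", "http://schema.org/bestRating", "5"),
    ("_:gs2", "http://schema.org/ratingValue", "5"),
    ("_:gs2", "http://schema.org/bestRating", "5"),
]
print(drop_duplicate_triples(triples))
```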
This service is offered free of charge by http://binarypark.org
You must follow any policies made available to you within the Services.
We trust that you will not misuse this service and will instead find it helpful. However, just in case:
Using this service does not give you ownership of any intellectual property rights related to the service or the content
you access. You may not use content from our Services unless you obtain permission from its owner or are otherwise permitted
by law. These terms do not grant you the right to use any branding or logos used in this service. Don’t remove, obscure, or
alter any legal notices displayed in or along with the service.
This service provides content that is not owned by the service provider. This content is the sole responsibility of the entity that makes it available.
The terms of use can change at any time, and it is not the provider's responsibility to inform you.
More necessary information may be found at http://binarypark.org.
If you are interested in learning more about this service or contributing to it, please contact us at mtg(at)binarypark.org.
[1] Resource Description Framework (RDF), http://www.w3.org/RDF/
[2] RDFa Lite, http://www.w3.org/TR/rdfa-lite/
[3] Notation3 (N3): A readable RDF syntax, http://www.w3.org/TeamSubmission/n3/
[4] JavaScript Object Notation, http://json.org/
[5] RDF N-Triples Syntax, http://www.w3.org/TR/rdf-testcases/#ntriples but also http://www.w3.org/2011/rdf-wg/wiki/N-Triples-Format
[6] RDF-JSON Specification, http://docs.api.talis.com/platform-api/output-types/rdf-json
[7] Uniform Resource Identifier (URI): Generic Syntax (RFC3986), http://www.ietf.org/rfc/rfc3986.txt
[8] Web Application Description Language (WADL), http://www.w3.org/Submission/wadl/
[9] node.js, http://nodejs.org
[10] jsdom – A JavaScript implementation of the DOM, for use with node.js, https://github.com/tmpvar/jsdom