Recurrent Neural Network writes Semantic Web Specification

The other day danbri brought Recurrent Neural Networks to my attention, linking to the post Lisl’s Stis: Recurrent Neural Networks for Folk Music Generation. Now although I’ve always had one eye on AI stuff, I missed RNNs. They’re pretty amazing; check out The Unreasonable Effectiveness of Recurrent Neural Networks. I’ve been meaning to get back into signals a bit (FFT etc.), especially in the context of neural nets, but despite a recent nosey around Octave (open source Matlab-like kit) I hadn’t really bitten. You can throw signals at RNNs and they’ll learn stuff.

The tools behind the RNN links above are written in Lua, specifically using Torch (leading to a setup not unlike Octave/Matlab):

Torch is a scientific computing framework with wide support for machine learning algorithms. It is easy to use and efficient, thanks to an easy and fast scripting language, LuaJIT, and an underlying C/CUDA implementation.

Torch was relatively straightforward to set up (on Ubuntu), except I needed one tweak to the script install.sh – comment out -DWITH_LUAJIT21=ON (around line 41) before running.

There’s code for running an RNN on text on GitHub, char-rnn.

For a play, I snagged the W3C semweb specs:

wget \
     --recursive \
     --no-clobber \
     --page-requisites \
     --html-extension \
     --convert-links \
     --restrict-file-names=windows \
     --domains www.w3.org \
     --level=1 \
         http://www.w3.org/2001/sw/Specs

(manually deleting the things that weren’t HTML spec docs)
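The manual clean-up can be narrowed down by first listing what wget pulled in besides HTML pages. A minimal sketch, assuming the download sits under one directory; `list_non_html` is a made-up helper name, and deciding which of the remaining HTML pages are really spec documents still has to be done by eye:

```shell
# list_non_html DIR: print every downloaded file whose name does not end in .html.
# Hypothetical helper for review only; it deletes nothing.
list_non_html() {
    find "$1" -type f ! -name '*.html' | sort
}
```

With the wget options above, something like `list_non_html www.w3.org` should show the page requisites (stylesheets, images and so on) that came along for the ride.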

Concatenated them together:

cat $(find . -name \*.html) > input.txt

producing a 17MB file (char-rnn looks for input.txt inside the -data_dir directory), then passed to the RNN with the following:

th train.lua -data_dir data/w3c -gpuid -1 -rnn_size 128 -num_layers 3 -dropout 0.2 -max_epochs 20 -eval_val_every 200

3 layers of 128 nodes, pretty much guessed. I don’t have a suitable GPU (hence -gpuid -1), so I set it running with nice=5 and left it to it. It quite quickly learnt the general shape of words and markup, though it was a few hours before it got remotely intelligible. It would have run for about a week but crashed some time this morning, after 3 days (10 epochs), dunno why:

saving checkpoint to cv/lm_lstm_epoch10.91_1.3082.t7    
/home/danny/torch/install/bin/luajit: /home/danny/torch/install/share/lua/5.1/torch/File.lua:157: write error: wrote 206438 blocks instead of 446773 at /home/danny/torch/pkg/torch/lib/TH/THDiskFile.c:314

[PS. was out of disk space 🙁 ]
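A crash like this can be caught before the run starts by checking free space on the volume that holds the checkpoints. A minimal sketch, assuming a POSIX `df`; `free_kb` and `has_free_kb` are hypothetical helper names, and the threshold is something you would have to pick yourself:

```shell
# free_kb DIR: print the free space, in 1 KiB blocks, on the filesystem holding DIR.
free_kb() {
    # -P forces the one-line-per-filesystem POSIX layout; column 4 is "Available".
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_free_kb DIR MIN_KB: succeed only if at least MIN_KB KiB are free.
has_free_kb() {
    [ "$(free_kb "$1")" -ge "$2" ]
}
```

Something like `has_free_kb cv 1000000 || echo "low on disk space" >&2` just before `th train.lua` would do, where the 1000000 is an arbitrary figure rather than a measured checkpoint size.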

Fortunately it was saving regular ‘checkpoint’ files, so there was a recent one from which to generate some new material. I had to delete 3 little errors to get the output through HTML Tidy so it wouldn’t mess up embedding here (I ignored the 96 warnings). Viewed directly in a browser the layout isn’t so scrunched.
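Generating from a checkpoint goes through char-rnn’s sample.lua. A minimal sketch, assuming it is run from the char-rnn checkout; `sample_checkpoint` is a hypothetical wrapper, the checkpoint path and length are whatever you choose, and the flags are worth confirming with `th sample.lua -help`:

```shell
# sample_checkpoint CHECKPOINT LENGTH: write LENGTH generated characters to stdout.
# Hypothetical wrapper around char-rnn's sample.lua.
sample_checkpoint() {
    # -gpuid -1 keeps sampling on the CPU, matching the training run;
    # -verbose 0 is meant to keep diagnostic messages out of the generated text.
    th sample.lua "$1" -gpuid -1 -length "$2" -verbose 0
}
```

Redirecting the output to a file, e.g. `sample_checkpoint <checkpoint>.t7 5000 > sample.html` with a real checkpoint path filled in, gives something to feed to HTML Tidy. The 5000 is an arbitrary length.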

So here is the result:


for all representations for expressed in an instance in
attribute languages version in the HTML has a subject extensions
contains a conditions,
in a starthor given encoding ovalue name. For example, the
association that derived truth mappings by not options. What ‘
OWL-RIF-Core Content )


1.4 Favigation Some Include,
2009.
AGMAGEO.

Set defined
it from
http://www.w3.org/TR/xmlschema11-2/#bib-PROV-OTINTERM">
2.4.3 link query
specification.


1.3
5.4.5.1 In policy after
uncertainty, of
The
PROV requirements on schemas are licter
:
definedending
parts
extends define the activity views and after to a
subject, and the document.

NPVE
pages
or parsing for position is used in which activity
case, and three other files, the working value, the I+ to
the document that class the IRI class representations
conganted through produces. Recommendation.

The Test Summary Garsiin Are knowledge formulas (Drecord)]
vocabulary application of block can be alternated from OWL 2 RL
is the can be sets all including the RDF table of a syntax, the
Recipe 1.0 extensions to RDF graph(functional programmar.

5.2 Data Target Accessible Data
enumber 2013

This deployment base own this extendes from a is pattern and
associated within the
wl:Responsion

Glieinwes POWDER on a document (([a62 41]

evend from this is each can be well-itement for but
similary point import for etc. Other applye convention vocabulary
is the specification. Having that specified in
rdfs:Note is made comporents ambeld, by process:Any shown is a standards/The dictionary may be method, the it hech in a sequence links
red to have specifies.

Benone between activity of the following publication which
allows a services of uy of maps replaces to readers multiple
primer. Because 5 . search model to be image that the value cade
be tust allows an these aolul to be existent, the brafel well
signs kacynvermation of record it an OWL 2 Transformation of the
worked and strictors is describeds by for structurals.

Into by ontology, the definition, access to is the simple and
the reminold language in reasoning vocabulary structure,
requirement Herta (C9) for numbers was schema that is change
axiomatic element modules of below is ontology test or an
individual of the group.

Shown a document concepts to other wuse bind less would a
attributes that survial used by have specification, and
corresponding with completes of a value in Query will node
limiting the GRDDL is not publishing of a
members to a named as service have an import of its that
the languages, logical meaning to the reference then exist to a
veriic with query syntax of the review of the describes the
variable component identifier, whore indicate diseose as
vocabulary length two all example.com harding OWL 2 RDF concept
to the
Datamases".

This and a relationship from the page. The following data and
document, and
page accessed also. As a skos no restlations of to childen
river that counce zeromatization of default is not that the
argument
rdfs:comment on a linked defined sequences of not
name. It as non-normalizations pricitally.

RDF
Syntax

  • current-about
  • The

Resources of the URI Liret, Results

prov:documentation(on), described by a keywhich
as properties is the glan delain a specification string to with
carlization of event below. All simple way.

time
<http://www.w3.org/TR/xmlch_media-public-mediafor/

  • Generative data the predicates are component are urlicroded
    of the

    Type
RIF-FLD
Semantic Web Goods are documents to the level 1.1

2.6.4 Certer 8
Derived in the value of
books.

border: 3 820d84420.verse irclassity
xsd:group used list with the have is can be used using
the location or the sime bulting language, the function would
call is a directly as the framefore of the following production
of the following returns function between both the formal
mapping using the RDF class as XKOAL formally approved by the
group”>Working Draft
<Const type=”Mapping”>
<Var>lt</Var> </Frame> </Const>
</slot> </slot> <Frame>
<Var>p</Var> </const> </attribute>
</Frame> </Frame> </Frame> <Annotation>
</object> </formula> <formula”
open an RDF

danny
