spaCy NER Example
1. In your virtual env, install spaCy:
pip install --upgrade spacy
2. Install Jupyter too, as visualization is handy:
python -m pip install jupyter
3. Download the default English model for spaCy:
python -m spacy download en
4. Start Jupyter:
jupyter notebook --no-browser --NotebookApp.token='' --ip='*'
5. Update the existing model's NER. The default model does not detect Uber as an entity, so let's add it.
-----------------------------------------------------------------------------------------------Notebook
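import random
import spacy
from spacy import displacy

# Load the default English model (the 'en' shortcut name is a spaCy v2 convention)
nlp = spacy.load('en')

# Entity spans are (start_char, end_char, label); the end offset is exclusive
train_data = [("Uber blew through $1 million", {'entities': [(0, 4, 'ORG'), (18, 28, 'MONEY')]})]

# Show what the default model detects before any update
for text, _ in train_data:
    doc = nlp(text)
    # If you use displacy.serve instead, it will try to serve on port 5000
    displacy.render(doc, style='ent', jupyter=True)

# We see Uber is not picked up, so let's use train_data to update the model.
# Disable every pipe except 'ner' so that only the entity recognizer is updated.
with nlp.disable_pipes(*[pipe for pipe in nlp.pipe_names if pipe != 'ner']):
    # Later v2 releases provide nlp.resume_training() for continuing from pretrained weights
    optimizer = nlp.begin_training()
    for i in range(10):
        random.shuffle(train_data)
        for text, annotations in train_data:
            nlp.update([text], [annotations], sgd=optimizer)

# Save the model
nlp.to_disk("/nlp/model5")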
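The character offsets in train_data have to match the text exactly, and counting them by hand is error-prone. A minimal sketch of deriving them instead, assuming each phrase occurs only once in the text; find_span is a hypothetical helper written for this post, not part of spaCy:

```python
def find_span(text, phrase, label):
    """Return (start, end, label) for the first occurrence of phrase in text."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in {text!r}")
    # The end offset is exclusive, which is the convention train_data uses
    return (start, start + len(phrase), label)


text = "Uber blew through $1 million"
entities = [
    find_span(text, "Uber", "ORG"),
    find_span(text, "$1 million", "MONEY"),
]
train_example = (text, {"entities": entities})
print(train_example)
# ('Uber blew through $1 million', {'entities': [(0, 4, 'ORG'), (18, 28, 'MONEY')]})
```

Slicing the text with the returned offsets, e.g. text[18:28], is a quick way to confirm that a span covers exactly the phrase you meant to label.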
-------------------------------------------------------------------------------------------Client
import spacy
import random
from spacy import displacy
nlp = spacy.load('/nlp/model5')
text1 = "Uber brought in $2.8 billion in revenue in the second quarter of 2018, but ultimately lost $891 million thanks " \
        "to increased spending by the ride-hailing company, according to Bloomberg."
text2 = "Ride-hailing company Uber is still on track to book more than $10 billion in revenue this year"
foo22=nlp(text1)
displacy.render(foo22, style='ent', jupyter=True)
foo23=nlp(text2)
displacy.render(foo23, style='ent', jupyter=True)
P.S:
If you use displacy.serve(doc, style='ent'), it starts a web server on port 5000, and the 'dep' visualizer takes a long time to render.
In a notebook, use displacy.render(foo22, style='ent', jupyter=True) instead.