Total N00b question about entities (python) #11712
I'm sure anyone on here can answer this question, but I'm new to Python, so I could really do with a bit of help :) Basically, I want to extract the entities and drop everything except the entity text. For example, from PERSON I just want to add the names (nothing else) to a list so that I can remove any duplicates. I can add the entities to a list, but they still have the labels attached, so I can't then remove the duplicates. Here is an example of my code that looks like it works but does not:

    import spacy

    people = []
    places = []
    company = []

    def spacyents():
        spacy.prefer_gpu()
        nlp = spacy.load("en_core_web_lg")
        doc = nlp(testdata())
        for ent in doc.ents:
            if ent.label_ == "PERSON":
                people.append(ent)
            if ent.label_ == "ORG":
                company.append(ent)
            if ent.label_ == "GPE":
                places.append(ent)
Replies: 2 comments
Hi @skintflickz, perhaps you're looking for:

    for ent in doc.ents:
        if ent.label_ == "PERSON":
            people.append(ent.text)
sorted
Hi @skintflickz, perhaps you're looking for ent.text? It gives you a str that you can use to remove duplicates (e.g., set(people)).
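
For anyone landing here later, the suggestion above could be sketched roughly as follows. This is only a minimal illustration, not the one right way to do it: `FakeEnt` and `unique_texts` are hypothetical names made up for this example, and `FakeEnt` is a stand-in so the snippet runs without downloading a model. With real spaCy output you would pass `doc.ents` instead, since those entities expose the same `.text` and `.label_` attributes. It uses `dict.fromkeys` rather than `set(...)` only because that also keeps the names in the order they first appeared.

```python
from collections import namedtuple

# Hypothetical stand-in for a spaCy entity, so the sketch runs without a model.
# Entities from doc.ents expose the same .text and .label_ attributes.
FakeEnt = namedtuple("FakeEnt", ["text", "label_"])


def unique_texts(ents, label):
    """Return the distinct entity strings for one label, in first-seen order."""
    # Store the plain string (ent.text), not the entity object itself.
    texts = [ent.text for ent in ents if ent.label_ == label]
    # dict.fromkeys keeps the first occurrence of each string and preserves order.
    return list(dict.fromkeys(texts))


# Made-up sample data standing in for doc.ents.
ents = [
    FakeEnt("Ada Lovelace", "PERSON"),
    FakeEnt("London", "GPE"),
    FakeEnt("Ada Lovelace", "PERSON"),
    FakeEnt("Charles Babbage", "PERSON"),
]

people = unique_texts(ents, "PERSON")
places = unique_texts(ents, "GPE")
print(people)
print(places)
```

If the order does not matter, `set(people)` as suggested in the reply does the same job in one call.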