python - Remove some words and replace some other words from a txt file
I have a txt file (mytext.txt) containing many lines of text.
I would like to know:
- how to create a list of words that need to be deleted (I want to set the words myself)
- how to create a list of words that need to be replaced
For instance, if mytext.txt is:
ancient romans influenced countries , civilizations in following centuries. language, latin, became basis many other european languages. stayed in roma 3 month.
- I want to remove "the", "and" and "in", and replace "ancient" with "old"
- I want to replace "month" and "centuries" with "years"
You can use a regex:
import re

st = '''\
ancient romans influenced countries , civilizations in following centuries. language, latin, became basis many other european languages. stayed in roma 3 month.'''

# words to delete outright
deletions = ('and', 'in', 'the')
# word -> replacement
repl = {"ancient": "old", "month": "years", "centuries": "years"}

# one pattern that matches any of the deletions as a whole word
tgt = '|'.join(r'\b{}\b'.format(e) for e in deletions)
st = re.sub(tgt, '', st)

# replace each target word, again matching whole words only
for word in repl:
    tgt = r'\b{}\b'.format(word)
    st = re.sub(tgt, repl[word], st)

print(st)
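Since the question is about a file rather than a string, here is a rough sketch of how the same idea could be wrapped up for mytext.txt. The function names clean_text and clean_file, the output path argument and the UTF-8 encoding are my own assumptions, not part of the answer above. It also differs in two small ways: the replacements are done in a single re.sub pass with a lookup function instead of a loop, and the double spaces left behind by deleted words are collapsed. Matching stays case-sensitive, so "The" would not be removed unless you add it to the list.

```python
import re


def clean_text(text, deletions, replacements):
    """Delete whole-word matches of `deletions`, then apply `replacements`.

    Hypothetical helper: matching is case-sensitive and whole-word only.
    """
    if deletions:
        # \b(?:and|in|the)\b ; re.escape guards against regex metacharacters
        del_pattern = r'\b(?:{})\b'.format(
            '|'.join(re.escape(w) for w in deletions))
        text = re.sub(del_pattern, '', text)
    if replacements:
        repl_pattern = r'\b(?:{})\b'.format(
            '|'.join(re.escape(w) for w in replacements))
        # look up the replacement for whichever word matched
        text = re.sub(repl_pattern, lambda m: replacements[m.group(0)], text)
    # collapse runs of spaces left behind by the deletions
    return re.sub(r' {2,}', ' ', text)


def clean_file(src_path, dst_path, deletions, replacements):
    """Read src_path, clean its contents, write the result to dst_path.

    The UTF-8 encoding is an assumption; adjust it to match your file.
    """
    with open(src_path, encoding='utf-8') as src:
        text = src.read()
    with open(dst_path, 'w', encoding='utf-8') as dst:
        dst.write(clean_text(text, deletions, replacements))
```

You would call it as clean_file('mytext.txt', 'cleaned.txt', deletions, repl), where 'cleaned.txt' is just a placeholder name. A line that started with a deleted word keeps one leading space, so strip the lines afterwards if that matters to you.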