Fastest way to extract part of a long string in Python
I have a large set of strings, and I'm looking to extract a part of each of the strings. Each string contains a substring like this:
my_token:["key_of_interest"], This is the part of each string that says my_token. I was thinking of getting the end index position of ' my_token:[" ', then getting the beginning index position of ' "], ', and then getting the text between the two index positions.
Is there a better or more efficient way of doing this? I'll be doing this on strings of length ~10,000 and sets of size 100,000.
Edit: The file is an .ion file. From my understanding, it can be treated as a flat file, since it is text based and used for describing metadata.
How can this possibly be done the "dumbest and simplest way"?
- find the starting position
- look for the ending position
- grab everything between the two, indiscriminately
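The three steps above can be sketched as follows. This is only a minimal sketch: it assumes the delimiters are exactly `my_token:["` and `"]` as described in the question, and the function name `extract_token` and the sample string are made up for illustration.

```python
def extract_token(text, prefix='my_token:["', suffix='"]'):
    """Return the text between prefix and suffix, or None if either is missing.

    The default delimiters are an assumption taken from the question.
    """
    # Step 1: find the starting position (str.find returns -1 when absent).
    start = text.find(prefix)
    if start == -1:
        return None
    start += len(prefix)  # move past the prefix to the first wanted character

    # Step 2: look for the ending position, searching only after the prefix.
    end = text.find(suffix, start)
    if end == -1:
        return None

    # Step 3: grab everything between the two positions.
    return text[start:end]


# Hypothetical sample string, not real .ion data.
sample = 'name:"x", my_token:["key_of_interest"], other:[1, 2]'
print(extract_token(sample))  # prints: key_of_interest
```

For a whole set of strings, something like `[extract_token(s) for s in strings]` would apply it to each one.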
This is indeed what you're already doing, so any further improvement can only come from optimizing each step. Possible ways include:
- narrow down the search region (requires additional constraints/assumptions, per comment56995056)
- speed up the individual bits of the search operation, which include:
  - extracting the raw data from the format
    - you did this by disregarding the format altogether - so you have to make sure there'll never be incorrect parsing (e.g. the search terms embedded in strings elsewhere, or matching part of a token), per comment56995034
  - the elementary pattern comparison operation
    - unlikely to be improved in pure Python, since str.index is implemented in C and the implementation is already about as simple as it can possibly be
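As an illustration of narrowing the search region, `str.find` accepts optional `start` and `end` arguments that restrict where it looks. The sketch below assumes, purely hypothetically, that the token usually starts within the first `search_limit` characters; nothing in the question confirms such a constraint, and the name `extract_token_bounded` and the default of 2000 are invented for this example.

```python
def extract_token_bounded(text, prefix='my_token:["', suffix='"]',
                          search_limit=2000):
    """Like a plain find-and-slice, but try a bounded region first.

    Assumption (hypothetical): the prefix usually lies entirely within the
    first search_limit characters of the string.
    """
    # Search only text[0:search_limit]; the match must fit inside that slice.
    start = text.find(prefix, 0, search_limit)
    if start == -1:
        # The assumption did not hold for this string: fall back to a full
        # scan so the result stays correct (at the cost of rescanning).
        start = text.find(prefix)
        if start == -1:
            return None
    start += len(prefix)

    # The closing delimiter is searched for only after the prefix.
    end = text.find(suffix, start)
    if end == -1:
        return None
    return text[start:end]
```

Whether this helps at all depends on where the token really sits in the data; if it tends to be near the end, `str.rfind` would be the analogous choice. Timing both variants on a sample of the real strings would be the way to check.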