I have a concatenated string that I need to convert to snake case. For example:
converttobigendian -> convert_to_big_endian
Can this be achieved using a regular expression? If not, how can this be achieved?
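A regular expression alone cannot do this: the input has no delimiters or case changes to match on, so finding the word boundaries requires knowledge of the words themselves. The most straightforward solution is to use the wordsegment library. It is built for this exact task and ships with word-frequency data derived from a large English text corpus, which lets it pick the most probable segmentation.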
Install it first:
pip install wordsegment
Then use it in your code. The library does the heavy lifting: you only need to import it, call load() once, and then call segment().
import wordsegment
# This loads the internal dictionary and language model
wordsegment.load()
input_string = "converttobigendian"
segmented_words = wordsegment.segment(input_string)
snake_case_string = "_".join(segmented_words)
print(snake_case_string)
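# Output: convert_to_bigendian
# Note: "bigendian" is kept as a single word here, so the exact split depends on
# the library's word-frequency data and may differ from the split you want.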
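If you need full control over where the splits fall (for example, to always separate "big" and "endian"), the same dictionary-based idea can be sketched by hand. The following is only a minimal illustration, not how wordsegment works internally: the WORDS set and the to_snake_case function are hypothetical names made up for this example, and the "fewest words wins" rule is a simple stand-in for a real probability model.

```python
from functools import lru_cache

# Hypothetical vocabulary for illustration only; a real one would be far larger.
WORDS = {"convert", "to", "big", "endian", "little", "end", "on"}


def to_snake_case(text, words=WORDS):
    """Split text into known words (fewest words wins) and join with underscores."""
    max_len = max(map(len, words))

    @lru_cache(maxsize=None)
    def best(start):
        # Return the shortest tuple of words covering text[start:], or None.
        if start == len(text):
            return ()
        candidates = []
        for end in range(start + 1, min(len(text), start + max_len) + 1):
            piece = text[start:end]
            if piece in words:
                rest = best(end)
                if rest is not None:
                    candidates.append((piece,) + rest)
        return min(candidates, key=len) if candidates else None

    result = best(0)
    if result is None:
        raise ValueError(f"cannot segment {text!r} with the given vocabulary")
    return "_".join(result)


print(to_snake_case("converttobigendian"))
# Output: convert_to_big_endian
```

With a vocabulary you control, the result is predictable, but any word missing from the set makes the segmentation fail, so this only suits inputs drawn from a known, limited set of terms.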