regex - PHP: Splitting a large string by certain characters in as large chunks as possible


I am implementing the Google Translation API, which can take 5000 characters at a time, so I need to split larger documents into smaller ones and send multiple API requests.

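I therefore need to split the content into chunks that are as long as possible (but less than 5000 characters). Where the content has to be split, it should not be in the middle of a sentence, as that would make the translations difficult for Google to process.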

I would therefore like to give the method an array of characters at which it should split:

  • </div>
  • </p>
  • </section>
  • </blockquote>
  • </br>
  • . (dot space)

What would be the best approach for this?

Regexps are greedy by default.

.{0,4980}(\<\/div\>|\<\/p\>|\<\/section\>|\<\/blockquote\>|\<\/br\>|\.\s) 

This should give you the longest string ending in one of the delimiters.
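A minimal sketch of how that greedy pattern might be applied repeatedly to split a whole document. The function name `splitIntoChunks` is hypothetical, and several choices here are assumptions rather than part of the answer above: the `s` modifier is added so that `.` also matches newlines, the `u` modifier so that the limit is counted in UTF-8 characters rather than bytes, the room reserved for the delimiter is the length of the longest tag instead of a fixed 20, and a hard cut is used as a fallback when no delimiter is within reach. It also assumes the limit is comfortably larger than the longest delimiter.

```php
<?php
declare(strict_types=1);

/**
 * Split $text into chunks of at most $limit characters, cutting right
 * after the last delimiter that still fits into each chunk.
 */
function splitIntoChunks(string $text, int $limit = 5000): array
{
    // Delimiters from the question; "\.\s" is the "dot space" case.
    $tags = ['</div>', '</p>', '</section>', '</blockquote>', '</br>'];
    $alternatives = array_map(fn (string $t): string => preg_quote($t, '~'), $tags);
    $alternatives[] = '\.\s';

    // Leave room for the longest delimiter so a chunk never exceeds $limit.
    $room = $limit - max(array_map('strlen', $tags));

    // s: "." also matches newlines; u: count UTF-8 characters, not bytes.
    $cut  = '~\A.{0,' . $room . '}(?:' . implode('|', $alternatives) . ')~su';
    $fits = '~\A.{0,' . $limit . '}\z~su';
    $hard = '~\A.{1,' . $limit . '}~su';

    $chunks = [];
    while ($text !== '') {
        if (preg_match($fits, $text) === 1) {
            $chunks[] = $text; // the remainder fits into a single request
            break;
        }
        // Greedy match: the longest run that is followed by a delimiter.
        // If there is none, fall back to a hard cut at $limit characters.
        if (preg_match($cut, $text, $m) !== 1 && preg_match($hard, $text, $m) !== 1) {
            throw new RuntimeException('Could not split text (invalid UTF-8?)');
        }
        $chunks[] = $m[0];
        // Offsets are in bytes, so advance by the byte length of the match.
        $text = substr($text, strlen($m[0]));
    }
    return $chunks;
}
```

Calling `splitIntoChunks($html)` returns the chunks in order, so `implode('', $chunks)` gives back the original string. Each chunk could then be sent as its own API request.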

