I am having trouble finding an approach to solve this problem.
The input-output sequences are as follows:
input1 : aaagctgctagag output1 : a3gct2ag2
input2 : aaaaaaagctaagctaag output2 : a6agcta2ag
The input sequence can be as long as 10^6 characters, and the largest continuous repeated pattern is the one that gets counted. For example, "agctaagcta" in input2 will not become agcta2gcta; it will become "agcta2".
Any help is appreciated.