Additional file 9 - AluYa5 comparison with the chimpanzee genome

For each locus, this file summarizes the blast2 result of the identified ortholog aligned against the AluYa5 repeat, gives the RepeatMasker annotation of the identified orthologous locus, and reports the N details (the nscore and NPOSITIONS fields) for that locus.
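
The N details appear in each entry as the nscore and NPOSITIONS fields. In the entries listed here, nscore matches the percentage of the ortholog length covered by the NPOSITIONS range (for example, 483 of 6107 positions, or 7.91, in the first entry). The Python sketch below checks that reading. The function names are hypothetical, and the assumptions that coordinates are 1-based and inclusive and that multiple ranges would be separated by ";" are inferred from the listing, not stated in the file.

```python
import re

# Hypothetical helpers; the field layout is inferred from the entries in this file.
ANNOTATION_RE = re.compile(
    r"Ortholog annotation (\S+) Length (\d+) nscore ([\d.]+) NPOSITIONS (.*)"
)


def parse_annotation(line):
    """Parse one 'Ortholog annotation' line into a dict."""
    match = ANNOTATION_RE.match(line.strip())
    if match is None:
        raise ValueError("not an 'Ortholog annotation' line: %r" % line)
    label, length, nscore, positions = match.groups()
    n_ranges = []
    if positions.strip() != "NA":
        # Assumption: each range is 'start end', with ranges separated by ';'.
        for chunk in positions.split(";"):
            fields = chunk.split()
            if len(fields) == 2:
                n_ranges.append((int(fields[0]), int(fields[1])))
    return {
        "label": label,
        "length": int(length),
        "nscore": float(nscore),
        "n_ranges": n_ranges,
    }


def n_percentage(length, n_ranges):
    """Percentage of the sequence covered by inclusive (start, end) N ranges."""
    n_bases = sum(end - start + 1 for start, end in n_ranges)
    return round(100.0 * n_bases / length, 2)


entry = parse_annotation(
    "Ortholog annotation INDEL_PTS Length 6107 nscore 7.91 NPOSITIONS 3095 3577 ;"
)
print(n_percentage(entry["length"], entry["n_ranges"]))  # 7.91
```

Entries with NPOSITIONS NA yield an empty range list and therefore a percentage of 0.0, which agrees with the nscore 0.00 reported for them.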

Blast2 column abbreviations: OFC – ortholog start coordinate, OLC – ortholog end coordinate, RFC – repeat start coordinate, RLC – repeat end coordinate
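
Each Blast2 row pairs an interval on the ortholog (OFC to OLC) with an interval on the repeat (RFC to RLC). As one illustration of how these rows can be read, the Python sketch below merges the repeat-side intervals of a single entry and counts how many repeat positions are covered by at least one hit. The function names are hypothetical, and the coordinates are assumed to be 1-based and inclusive; neither is specified in this file.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent inclusive (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def repeat_positions_covered(rows):
    """Count repeat positions hit by at least one row.

    Each row is a tuple (OFC, OLC, RFC, RLC) as listed under 'Blast2 Results'.
    """
    repeat_intervals = [(rfc, rlc) for _ofc, _olc, rfc, rlc in rows]
    return sum(end - start + 1 for start, end in merge_intervals(repeat_intervals))


# Rows copied from entry 1 (AluYa5_10_100), whose repeat length is 308.
rows = [
    (11, 61, 11, 61),
    (254, 287, 240, 273),
    (779, 1053, 22, 293),
    (3606, 3655, 12, 61),
    (3749, 3867, 155, 273),
    (5660, 5952, 1, 292),
]
print(repeat_positions_covered(rows))  # 293
```

Merging before counting avoids counting a repeat position more than once when several ortholog segments align to the same part of the repeat, as happens in entry 1.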

1. AluYa5_10_100

Ortholog annotation INDEL_PTS Length 6107 nscore 7.91 NPOSITIONS 3095 3577 ;

Repeat length (main genome) 308

Blast2 Results -

OFC OLC RFC RLC

11 61 11 61

254 287 240 273

779 1053 22 293

3606 3655 12 61

3749 3867 155 273

5660 5952 1 292

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

1661 17.5 0.3 4.8 AluYa5_10_100 1 294 (5813) + AluJo SINE/Alu 1 281 (31) 1

2317 11.2 0.0 0.0 AluYa5_10_100 407 711 (5396) C AluSx SINE/Alu (7) 305 1 2

2124 10.7 0.7 1.4 AluYa5_10_100 759 1053 (5054) + AluSx SINE/Alu 2 294 (18) 3

2008 11.9 5.1 0.7 AluYa5_10_100 1197 1491 (4616) C AluSx SINE/Alu (2) 310 3 4

1998 27.2 24.4 2.1 AluYa5_10_100 1786 2309 (3798) C L2a LINE/L2 (297) 3129 2489 5

1811 18.7 0.3 0.0 AluYa5_10_100 2310 2609 (3498) C AluJo SINE/Alu (7) 305 5 6

1998 26.9 26.7 2.0 AluYa5_10_100 2610 3061 (3046) C L2a LINE/L2 (931) 2488 1925 5

1902 14.9 0.4 0.0 AluYa5_10_100 3595 3870 (2237) + AluJo SINE/Alu 1 277 (35) 7

1479 24.9 7.8 6.3 AluYa5_10_100 4049 4841 (1266) + L1MC4 LINE/L1 7198 8002 (40) 8

1827 25.5 15.1 3.8 AluYa5_10_100 4880 5659 (448) C L2a LINE/L2 (1443) 1976 1107 5

2432 8.9 0.3 0.0 AluYa5_10_100 5660 5963 (144) + AluSp SINE/Alu 1 305 (8) 9

1827 25.5 15.1 3.8 AluYa5_10_100 5964 6002 (105) C L2a LINE/L2 (2313) 1106 1064 5

______

2. AluYa5_10_117c

Ortholog annotation C_INTER_RMD_M_DISRUPTED Length 265 nscore 0.00 NPOSITIONS NA

Repeat length (main genome) 301

Blast2 Results -

OFC OLC RFC RLC

1 264 1 266

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

2046 10.6 0.4 0.0 AluYa5_10_117c 1 265 (0) + AluSg SINE/Alu 2 267 (43) 1

______

3. AluYa5_10_20

Ortholog annotation INDEL_CAN Length 320 nscore 0.00 NPOSITIONS NA

Repeat length (main genome) 274

Blast2 Results -

OFC OLC RFC RLC

11 95 11 95

178 288 145 257

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

1892 11.3 1.7 3.7 AluYa5_10_20 5 304 (16) + AluY SINE/Alu 5 311 (0) 1

1892 11.3 1.7 3.7 AluYa5_10_20 305 317 (3) + AluY SINE/Alu 281 292 (19) 1

______

4. AluYa5_10_28c

Ortholog annotation C_INTER_RMD Length 285 nscore 0.00 NPOSITIONS NA

Repeat length (main genome) 298

Blast2 Results -

OFC OLC RFC RLC

1 285 1 285

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

1967 7.0 4.6 4.2 AluYa5_10_28c 1 285 (0) + AluY SINE/Alu 8 293 (18) 1

______

5. AluYa5_10_33c

Ortholog annotation INDEL_PTS Length 1241 nscore 18.78 NPOSITIONS 561 793 ;

Repeat length (main genome) 302

Blast2 Results -

OFC OLC RFC RLC

11 301 11 302

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

2354 9.2 0.7 0.0 AluYa5_10_33c 1 306 (935) + AluSx SINE/Alu 2 309 (3) 1

1308 18.8 0.8 2.4 AluYa5_10_33c 315 560 (681) + AluJo SINE/Alu 1 242 (70) 2

22 0.0 0.0 0.0 AluYa5_10_33c 911 932 (309) + AT_rich Low_complexity 1 22 (0) 3

______

6. AluYa5_10_48c

Ortholog annotation C_INTER_RMD_M_DISRUPTED Length 123 nscore 0.00 NPOSITIONS NA

Repeat length (main genome) 309

Blast2 Results -

OFC OLC RFC RLC

1 123 1 123

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

1065 7.3 0.0 0.0 AluYa5_10_48c 1 123 (0) + AluY SINE/Alu 1 123 (188) 1

______

7. AluYa5_10_50c

Ortholog annotation INDEL_CAN Length 1928 nscore 0.00 NPOSITIONS NA

Repeat length (main genome) 298

Blast2 Results -

OFC OLC RFC RLC

1 298 1 297

339 448 153 259

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

2196 12.0 0.6 0.6 AluYa5_10_50c 1 311 (1617) + AluSx SINE/Alu 2 312 (0) 1

1056 14.8 1.1 1.7 AluYa5_10_50c 312 490 (1438) + AluSg/x SINE/Alu 128 305 (7) 2

36 0.0 0.0 0.0 AluYa5_10_50c 491 526 (1402) + AT_rich Low_complexity 1 36 (0) 3

690 26.3 26.3 2.5 AluYa5_10_50c 547 1098 (830) + HAL1 LINE/L1 1346 2029 (478) 4

389 29.9 0.8 2.5 AluYa5_10_50c 1523 1642 (286) + MIR SINE/MIR 110 227 (35) 5

______

8. AluYa5_10_54c

Ortholog annotation INDEL_CAN Length 1928 nscore 0.00 NPOSITIONS NA

Repeat length (main genome) 298

Blast2 Results -

OFC OLC RFC RLC

1 298 1 297

339 448 153 259

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

2196 12.0 0.6 0.6 AluYa5_10_54c 1 311 (1617) + AluSx SINE/Alu 2 312 (0) 1

1056 14.8 1.1 1.7 AluYa5_10_54c 312 490 (1438) + AluSg/x SINE/Alu 128 305 (7) 2

36 0.0 0.0 0.0 AluYa5_10_54c 491 526 (1402) + AT_rich Low_complexity 1 36 (0) 3

690 26.3 26.3 2.5 AluYa5_10_54c 547 1098 (830) + HAL1 LINE/L1 1346 2029 (478) 4

389 29.9 0.8 2.5 AluYa5_10_54c 1523 1642 (286) + MIR SINE/MIR 110 227 (35) 5

______

9. AluYa5_10_58

Ortholog annotation C_INTER_RMD_M_DISRUPTED Length 133 nscore 0.00 NPOSITIONS NA

Repeat length (main genome) 307

Blast2 Results -

OFC OLC RFC RLC

1 123 1 123

20 53 154 187

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

1048 6.9 0.0 2.3 AluYa5_10_58 1 133 (0) + AluY SINE/Alu 1 130 (181) 1

______

10. AluYa5_10_95c

Ortholog annotation INDEL_PTS Length 3320 nscore 0.30 NPOSITIONS 3311 3320 ;

Repeat length (main genome) 305

Blast2 Results -

OFC OLC RFC RLC

12 299 12 298

936 964 238 266

1009 1296 1 285

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

2253 10.0 0.0 0.0 AluYa5_10_95c 1 299 (3021) + AluSx SINE/Alu 1 299 (13) 1

2813 16.0 3.6 1.2 AluYa5_10_95c 346 859 (2461) C LTR26B LTR/ERV1 (0) 531 16 2

814 15.2 0.0 0.0 AluYa5_10_95c 863 987 (2333) + FRAM SINE/Alu 30 154 (22) 3

1913 13.1 0.0 0.7 AluYa5_10_95c 1009 1320 (2000) + AluSx SINE/Alu 1 312 (0) 4

945 24.4 3.9 6.7 AluYa5_10_95c 1337 1578 (1742) C L1M5 LINE/L1 (3182) 2964 2728 5

1384 19.2 8.4 0.3 AluYa5_10_95c 1579 1865 (1455) C AluJb SINE/Alu (2) 310 1 6

945 24.4 3.9 6.7 AluYa5_10_95c 1866 1952 (1368) C L1M5 LINE/L1 (3419) 2727 2644 5

235 19.1 0.0 0.0 AluYa5_10_95c 2075 2116 (1204) C MLT1J LTR/MaLR (350) 162 121 7

2992 17.3 9.1 5.5 AluYa5_10_95c 2217 2909 (411) + LTR8A LTR/ERV1 1 718 (9) 8

372 24.6 11.4 0.8 AluYa5_10_95c 2936 3058 (262) + L2b LINE/L2 3029 3164 (211) 9

1256 9.6 2.1 0.0 AluYa5_10_95c 3062 3310 (10) C AluSq SINE/Alu (0) 313 121 10

______

11. AluYa5_10_98c

Ortholog annotation INDEL_PTS Length 2386 nscore 0.42 NPOSITIONS 2377 2386 ;

Repeat length (main genome) 309

Blast2 Results -

OFC OLC RFC RLC

91 142 11 62

238 525 12 298

1245 1275 22 52

1328 1357 102 131

1689 1733 12 56

1698 1730 155 187

1782 1969 103 288

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

796 18.2 0.0 0.8 AluYa5_10_98c 81 213 (2173) + FLAM_C SINE/Alu 1 132 (1) 1

2247 13.2 0.0 0.0 AluYa5_10_98c 227 537 (1849) + AluSx SINE/Alu 1 311 (1) 2

391 20.6 5.2 0.0 AluYa5_10_98c 548 644 (1742) + L1MEe LINE/L1 935 1036 (5083) 3

2152 7.5 4.6 0.3 AluYa5_10_98c 659 941 (1445) C AluSx SINE/Alu (17) 295 1 4

415 22.1 10.6 4.2 AluYa5_10_98c 944 1085 (1301) + L1MEe LINE/L1 1856 2006 (4143) 3

1832 14.5 3.2 0.4 AluYa5_10_98c 1227 1503 (883) + AluJb SINE/Alu 2 286 (26) 5

198 10.3 0.0 0.0 AluYa5_10_98c 1633 1661 (725) + (T)n Simple_repeat 1 29 (0) 6

1927 16.7 0.0 0.9 AluYa5_10_98c 1678 1992 (394) + AluJb SINE/Alu 1 312 (0) 7

25 0.0 0.0 0.0 AluYa5_10_98c 2095 2119 (267) + AT_rich Low_complexity 1 25 (0) 8

______

12. AluYa5_11_127c

Ortholog annotation C_INTER_RMD_M_DISRUPTED Length 127 nscore 0.00 NPOSITIONS NA

Repeat length (main genome) 308

Blast2 Results -

OFC OLC RFC RLC

1 80 1 80

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

983 9.4 3.1 0.0 AluYa5_11_127c 1 127 (0) + AluSg SINE/Alu 1 131 (179) 1

______

13. AluYa5_11_12c

Ortholog annotation INDEL_CAN Length 2297 nscore 0.00 NPOSITIONS NA

Repeat length (main genome) 331

Blast2 Results -

OFC OLC RFC RLC

1 296 1 298

537 827 10 298

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

2126 12.0 0.3 0.7 AluYa5_11_12c 1 303 (1994) + AluSg SINE/Alu 2 303 (7) 1

2443 6.9 0.0 1.6 AluYa5_11_12c 527 835 (1462) + AluSg SINE/Alu 1 304 (6) 2

382 28.2 7.8 2.4 AluYa5_11_12c 1199 1287 (1010) + MIRc SINE/MIR 74 168 (100) 3

1639 14.8 5.3 3.0 AluYa5_11_12c 1288 1463 (834) C AluSx SINE/Alu (5) 307 127 4

189 0.0 0.0 0.0 AluYa5_11_12c 1464 1484 (813) + (TG)n Simple_repeat 2 22 (0) 5

1639 14.8 5.3 3.0 AluYa5_11_12c 1485 1607 (690) C AluSx SINE/Alu (186) 126 1 4

382 28.2 7.8 2.4 AluYa5_11_12c 1608 1684 (613) + MIRc SINE/MIR 169 249 (19) 3

22 3.5 0.0 0.0 AluYa5_11_12c 2269 2297 (0) + AT_rich Low_complexity 1 29 (0) 6

______

14. AluYa5_11_135c

Ortholog annotation INDEL_CAN Length 54 nscore 0.00 NPOSITIONS NA

Repeat length (main genome) 305

Blast2 Results -

OFC OLC RFC RLC

no hits found

Ortholog Repeat Masker annotation

There were no repetitive sequences detected in /home/vipin/WHOLE_GENOME_CG/AluYa5_CHR/Chimp/CONFIRMATION/AluYa5_INDEL_SEQUENCES/AluYa5_11_135c

______

15. AluYa5_11_152c

Ortholog annotation INDEL_PTS Length 189 nscore 0.00 NPOSITIONS NA

Repeat length (main genome) 299

Blast2 Results -

OFC OLC RFC RLC

no hits found

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

1395 4.9 1.1 0.0 AluYa5_11_152c 2 186 (3) C L1P3 LINE/L1 (709) 5463 5277 1

______

16. AluYa5_11_160c

Ortholog annotation C_INTER_RMD_M_DISRUPTED Length 190 nscore 0.00 NPOSITIONS NA

Repeat length (main genome) 190

Blast2 Results -

OFC OLC RFC RLC

1 190 1 190

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

1531 5.3 0.5 1.1 AluYa5_11_160c 1 190 (0) + AluY SINE/Alu 106 294 (17) 1

______

17. AluYa5_11_163c

Ortholog annotation C_INTER_RMD_M_DISRUPTED Length 120 nscore 0.00 NPOSITIONS NA

Repeat length (main genome) 298

Blast2 Results -

OFC OLC RFC RLC

1 79 1 79

88 120 97 129

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

928 7.5 7.5 0.0 AluYa5_11_163c 1 120 (0) + AluYg SINE/Alu 3 131 (180) 1

______

18. AluYa5_11_165

Ortholog annotation INDEL_CAN Length 978 nscore 1.02 NPOSITIONS 195 204 ;

Repeat length (main genome) 280

Blast2 Results -

OFC OLC RFC RLC

706 978 9 279

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

2361 8.5 0.3 1.3 AluYa5_11_165 668 978 (0) + AluSg SINE/Alu 1 308 (2) 1

______

19. AluYa5_11_22

Ortholog annotation INDEL_PTS Length 1599 nscore 0.00 NPOSITIONS NA

Repeat length (main genome) 309

Blast2 Results -

OFC OLC RFC RLC

12 125 12 123

159 297 162 301

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

2231 11.3 0.6 1.0 AluYa5_11_22 1 314 (1285) + AluSq SINE/Alu 1 313 (0) 1

1273 11.4 0.0 1.7 AluYa5_11_22 1393 1571 (28) C AluSg/x SINE/Alu (10) 302 127 2

______

20. AluYa5_11_6

Ortholog annotation C_INTER_RMD Length 256 nscore 0.00 NPOSITIONS NA

Repeat length (main genome) 300

Blast2 Results -

OFC OLC RFC RLC

143 246 186 289

Ortholog Repeat Masker annotation

SW perc perc perc query position in query matching repeat position in repeat

score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID

1056 8.4 0.0 0.0 AluYa5_11_6 1 142 (114) + L1P1 LINE/L1 3608 3749 (2397) 1

900 8.1 0.0 0.0 AluYa5_11_6 143 253 (3) + AluY SINE/Alu 185 295 (16) 2

______

21. AluYa5_11_83