String lengths are totally managed by PB.

    Dim astring.s(9)
    ; Strings only grow.
    Debug "Start at " + "Array at " + "next at " + "end at " + StringByteLength(bstring)
    Debug "astring(0) Length" + Str(StringByteLength(astring(1)))
    Debug "element 0 bytes " + ... ; no initializing, size = 0
    astring(0) = "1" ; changing does NOT affect size
    Debug "astring(0) Length " + Str(StringByteLength(astring(1)))
    Debug ... ; first change, still 16 bytes at original location
    astring(1) = "e234" ; 4 causes increase (3 does not)
    Debug "element 1 bytes " + ... ; no initializing, size = 0 (Use 1 because 0 is "managed" into previous location.)
    astring(1) = "a12345678902" ; more than 11 and it increases
    Debug ... ; 3 characters take up 16 bytes, but 4 characters consume 24 bytes

That means 4 bytes per character, plus 4 bytes for the zero. I'm not set to use Unicode (I'm on Windows). Why does each character take more than one byte? (Heck, why more than TWO?) It seems grossly wasteful to use a full 64 bits for an 8-bit character. How do I set it to one byte/char as default? I could even sort of understand 32 bits, if you want to make it Unicode compatible.

Since you are talking about 8000 or so sequences, I would suggest that you create a text file with one sequence per line, such as GATAC on one line and CGCGT on the next. The format permits any combination of characters. You can change the contents of this file whenever you want. Your program can easily process the whole DNA sequence. The DNA sequence can either be embedded into the program if it is static, or could be read from the same file, or a different one.

PureBasic has a maximum string length for any one string of just over 64,000 characters, so the DNA sequence cannot be longer than that, or you will have to find another way to hold the sequence, possibly in an array.

Searching for one of your sequences is then just a matter of reading the sequence file one line at a time, discarding everything from the first space to the end of the line, then using FindString(DNA$,Sequence$). If the value returned is non-zero, the sequence was found, and the value indicates its offset from the beginning of the DNA$ string. If you find you have to put the DNA sequence into a memory block, you can use the CompareMemoryString() function instead.

One problem to recognize is the possibility of overlapping sequences. That is, two sequences appear to both be in the DNA sequence, but a portion of the DNA sequence appears to be the trailing part of one sequence and the leading part of the other. With the two example sequences above, if you found a section of the DNA sequence that was coded GATACGCGT, you would get a match for both GATAC, beginning at the first character, and CGCGT, beginning at the 5th character. But the DNA sequence may have its own rules as to whether overlapping sequences are valid; it may be bound by rules that say that each found sequence must be positioned on a certain boundary to be valid.

If the sequences you are looking for are all five characters in length, then you could use the MOD(offset,5)=1 test to ensure any match begins on a boundary, such as 1, 6, 11, 16, etc. You would have to work with an offset in FindString() to get beyond the first find and on to any subsequent ones.

If coded sequences could be of variable length, then you cannot be sure where the boundaries fall. If one sequence is five characters and another is six characters, then the order of which is first, five followed by six, or six followed by five, would cause a difference in the boundary position between the two.

This problem would then mandate that you have to identify and process each sequence in the natural order as it would be found in the DNA sequence, and either know where its starting position would be, or replace each found sequence in the DNA string with an equal number of spaces or other filler characters, something like ".". This would prevent a false match to something else in the future.

So like I said, it is fairly straightforward as to how to find matches, but...
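As a rough illustration of the file-based search described in the reply, here is a minimal PureBasic sketch. The file name `sequences.txt` and the sample `DNA$` value are made up for the example; the only assumption about the file is the one stated in the thread, that each line starts with a sequence and anything after the first space can be ignored.

```purebasic
; Sketch only: "sequences.txt" and the DNA$ value are hypothetical placeholders.
DNA$ = "TTGATACGCGTAA"

If ReadFile(0, "sequences.txt")              ; expects one sequence per line
  While Eof(0) = 0
    Line$ = Trim(ReadString(0))
    Sequence$ = StringField(Line$, 1, " ")   ; keep only the text before the first space
    If Sequence$ <> ""
      Position = FindString(DNA$, Sequence$) ; 0 = not found, otherwise 1-based offset
      If Position
        Debug Sequence$ + " found at offset " + Str(Position)
      Else
        Debug Sequence$ + " not found"
      EndIf
    EndIf
  Wend
  CloseFile(0)
Else
  Debug "Could not open sequences.txt"
EndIf
```

This only reports the first occurrence of each sequence and does nothing about overlaps or boundaries.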
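The boundary test and the "replace what you found" idea from the reply could be combined roughly as follows. This is a sketch under assumptions, not a finished tool: the `DNA$` value is invented, every sequence is assumed to be exactly five characters long, and the sequences are simply tried in array order. PureBasic's integer modulo operator `%` stands in for the MOD(offset,5)=1 test.

```purebasic
; Sketch only: DNA$, the two sequences and the fixed 5-character
; boundary rule are illustrative assumptions.
DNA$ = "GATACGCGTAGATAC"
SeqLen = 5
Mask$ = LSet("", SeqLen, ".")        ; five "." characters used to blank out a match

Dim Sequence$(1)
Sequence$(0) = "GATAC"
Sequence$(1) = "CGCGT"

For i = 0 To 1
  Start = 1
  Repeat
    Position = FindString(DNA$, Sequence$(i), Start)
    If Position = 0 : Break : EndIf
    If Position % SeqLen = 1         ; valid starts are 1, 6, 11, 16, ...
      Debug Sequence$(i) + " accepted at offset " + Str(Position)
      ; overwrite the match so it cannot take part in a later false match
      DNA$ = Left(DNA$, Position - 1) + Mask$ + Mid(DNA$, Position + SeqLen)
      Start = Position + SeqLen      ; continue after the masked match
    Else
      Debug Sequence$(i) + " rejected at offset " + Str(Position) + " (off boundary)"
      Start = Position + 1           ; keep looking one character further on
    EndIf
  Until Start > Len(DNA$)
Next

Debug DNA$                           ; remaining, unmatched characters plus the masks
```

Because each accepted match is masked before the next sequence is tried, the result depends on the order of the sequences. With variable-length sequences that order matters even more, which is the difficulty the reply points out.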