Todkapuz Thread Counter v6

From Whuckaba


Todkapuz Thread Counter (08SAIMO5) is a DOS-based program intended to count votes and do minor statistics around the SAIMOE Tourney. It was originally developed in 2006 and is still usable in 2008. Version 6 is a minor overhaul of the base programming. There are only minor changes from 5.98 to 6.00, but it represents the beginning of the next phase of programming.




Legal Stuff

Disclaimer

This program is provided “as is”. I make no warranties or guaranties, written or implied, for the use, or misuse, of this program. I do not recommend obtaining this program from any source other than this website.

The use of this program is for entertainment purposes only.

As the program does not have access to a valid vote code record, this program will never be 100% accurate.

License

This program is provided as freeware for nonprofit and personal use, however the source code is not currently being provided as freeware. This program is not intended, nor licensed, for commercial usage. Please contact me for use within a commercial program.

Donations are welcome (contact me on AnimeSuki at Todkapuz).

Concept

Saimoe is an annual festival to share our favorite anime women with each other. It is intended to provide visibility to lesser known series, and to provide kinship for the more known series.

As part of the Saimoe festival, during the primary tourney, different anime women characters compete to determine who is the most Moe. This is done in a public forum (2ch.net) so the workings and support are all visible to all who partake.

The official scoring program uses several rules to validate votes. Primarily, the use of unique vote codes from unique posting addresses. While the valid vote codes are not released until after the polling is complete, the other aspects are available and clear to the public. This program uses those items to help remove “fake voting.”

This program is still subject to malicious fake voting, pre-voting, and admin corrections, so it will never be 100%. But it will remove most of the unwanted fake votes, and provide a reasonably close result table to use for various other purposes.

Operation

  • Obtain Saimoe.exe (or equiv named file)
  • Obtain the thread sources: go to the thread, make sure you are viewing the whole thread, right-click, and choose view source (or a similar command). Then save the resulting file in the same directory as this program (preferably named in.txt, or similar).
  • Edit Match.txt based on configuration section of this wiki.
  • Run the program.

I know it's a bit more complex than that, maybe... but hopefully the examples will provide guidance.

I recommend that you download in.txt, match.txt, and this program to get a feel for it.

Configuration of Program

This program is intended to be configured from the match.txt file. Due to unforeseen complications, the v6.00 program's match.txt is not compatible with previous versions of this program, nor is the v5.98 match60.txt forward compatible, despite attempts to keep the same match file.

The file match.txt is required to run this program. match.txt is the default name, however this program does support drag-n-drop of match.txt of other names when using a short-cut to the program. See bug section.

Example of match.txt file: (update before release!)

-1
0
-2
-1
-1
0
0
-1
01:00:00
23:11:59
[[AS7-/Ria5eVO-HD]]
AS7-
-1
0
0
-1
-1
0
0
0
master.txt
master.txt
master.txt
1
in.txt
9
1,Kokoro
1,Ranka
1,Mayu
2,Mika
2,Siesta
2,Mikan
3,Horo
3,Asuka
3,Gekka
1,<<ŒjS—School Days ƒVƒŠ[ƒY>>
1,<<ŒjS>>
2,<<ƒ‰ƒ“ƒJEƒŠ[—ƒ}ƒNƒƒXFRONTIER>>
2,<<ƒ‰ƒ“ƒJEƒŠ[>>
3,<<ŒŽ‘º^—R—‚²D‚³‚Ü“ñƒm‹{‚­‚ñ>>
3,<<ŒŽ‘º^—R>>
4,<<ˆîXŒõi‚Ý‚©‚ñj—‚ª‚­‚¦‚ñ‚ä[‚Æ‚Ò‚  ‚܂ȂуXƒgƒŒ[ƒg!>>
4,<<ˆîXŒõi‚Ý‚©‚ñj>>
5,<<ƒVƒGƒXƒ^—ƒ[ƒ‚ÌŽg‚¢–‚ `‘oŒŽ‚Ì‹RŽm`>>
5,<<ƒVƒGƒXƒ^>>
6,<<Š‹é‚Ý‚©‚ñ—ƒŒƒ“ƒ^ƒ‹ƒ}ƒMƒJ>>
6,<<Š‹é‚Ý‚©‚ñ>>
7,<<ƒzƒ—˜T‚ƍh—¿>>
7,<<ƒzƒ>>
8,<<ç–ì–¾“ú‰Ä—ƒLƒ~ƒLƒX pure rouge>>
8,<<ç–ì–¾“ú‰Ä>>
9,<<—èŒŽ—‹¶—‰Æ‘°“ú‹L>>
9,<<—èŒŽ>>

The various line controls are detailed below:

Program Configuration

Line 1 of Match.txt

  • Configme(0) - Use File Options
  • This switch determines if it will force the usage of configuration file values. '-1' will use the settings in this file, '0' will allow real-time configuration for some, but not all configurations values. I recommend '-1'.

Line 2 of Match.txt

  • Configme(10) - Memory Usage
  • This switch determines if large volume data is compiled on the hard-drive or in memory. Only file caching is currently deemed stable. SEE MEMORY USAGE WARNING.
  • '0' = file [stable], '-1' = memory for VOTE CODES, '-2' = memory for vote codes and post IDs.
  • Using memory for sorts saves 10-30% on time based on my study. If program crashes unexpectedly, go back to file sorts.

Visualization \ Operation Mode

Line 3 of Match.txt

  • throttle & Configme(1) & Configme(2)
  • This is the most enjoyable part for me. Since the added processing has slowed the program down, this allows you to get a visual representation of what is happening while it runs. Times in seconds are based on a test run of each mode on Pre-release Beta 5.98, performed 09/21/2008.

No Interaction

  • 15 Seconds in File Mode, 9 Seconds in Memory Mode
  • Realtime Selection Code: Not Available in Real-Time Selection
  • File Selection Code: -2
  • Basically doesn't show anything other than where in the timeline it is in processing. No visual goodies.
  • Does NOT stop for user interaction. Use this mode if you only care about getting the results in the files.

Throttled Up

  • 15 Seconds in File Mode, 9 Seconds in Memory Mode
  • Realtime Selection Code: T
  • File Selection Code: -1
  • Basically doesn't show anything other than where in the timeline it is in processing. No visual goodies.

Original Mix

  • 35 Seconds in File Mode, 20 seconds in Memory Mode
  • Realtime Selection Code: 0
  • File Selection Code: 0
  • This will show the last time the chara got a vote, whether it got a vote this time, and total votes, as they are collected. This is the original visual information from before Version 5.0.

BarGraph w/o Fakes

  • 35 Seconds in file, 20 seconds in Memory Mode
  • Realtime Selection Code: 1
  • File Selection Code: 1
  • This will show all items from the Original Mix, plus a visual bar graph representation of where the votes are, and percentages. Very cool to see streaks late in a match that alter the playing field. Bar graphs are scaled based on the highest percentage. Note, percentages are of ALL votes, not by match; per-match percentages might be added in the future.
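As a rough illustration of the scaling described above, the Python sketch below builds text bars where the character with the most votes gets the full width, and percentages are taken against ALL votes. The function name `bar_rows`, the `width` parameter, and the `#` bar character are assumptions made for illustration; they are not taken from the program itself.

```python
def bar_rows(votes, width=40):
    """Return (name, percent_of_all_votes, bar) rows for a text bar graph.

    Hypothetical helper: bars are scaled so the leader fills `width` columns.
    """
    total = sum(votes.values())
    if total == 0:
        return []
    top = max(votes.values())
    rows = []
    for name, count in votes.items():
        percent = 100.0 * count / total          # share of ALL votes, not per match
        bar = "#" * int(width * count / top)     # leader gets the full width
        rows.append((name, percent, bar))
    return rows


for name, percent, bar in bar_rows({"Horo": 50, "Asuka": 25, "Gekka": 25}, width=20):
    print("%-8s %5.1f%% %s" % (name, percent, bar))
```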

Bar Graph w/ Fakes

  • 35 Seconds in file mode, 20 seconds in memory mode
  • Realtime Selection Code: 2
  • File Selection Code: 2
  • Same as above, but also adds in the graph for fake votes in red. In a fair match, the fake votes probably will not even be enough to show... but in some matches it is impressive.

Waterfall

  • 36 Seconds in file mode, 22 seconds in memory mode.
  • Realtime Selection Code: 3
  • File Selection Code: 3
  • Significantly changed in version 5.9.
  • This is my favorite method. It shows a graphical representation of the votes. Shows when streaks are occurring, and where fake votes are. If you really want to get a feel for the match, this is the way to go.

Fake Vote Removal Options

Line 4 of Match.txt

Duplicate Vote Codes

  • ConfigMe(3) - Duplicate Vote Code Checking [was Configme(5) in v5.98]
  • '-1' for Remove Fakes by Double Vote Code, '0' for not. '-1' is recommended.
  • Major change in Version 4.5
  • Using this necessitates Duplicate Post IDs checking and No Vote Code checking.
  • This is to fight a fake vote method that was appearing where people simply copied other people's posts. Saimoe rules dictate that only one Vote Code can be used per valid vote (except for corrections, which are beyond the scope of this program at this time).

Line 5 of Match.txt

Duplicate Post IDs

  • ConfigMe(4) - Duplicate Post ID Checking [was configme(6) in v5.98]
  • '-1' for Remove Fakes by Double Post ID, '0' for not. '-1' is recommended.
  • Major change in Version 4.5
  • Cannot be used without Duplicate Vote Code Checking active.
  • Saimoe rules dictate that only one valid vote can come from each ID (even though it actually is possible for 2ch to assign two people the same Post ID in a 24 hour period.)
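The two duplicate checks above can be sketched in Python as follows. This is only an illustration under stated assumptions: posts are processed in thread order, the first post to use a vote code or post ID is kept, and later repeats are tossed as fakes. The dictionary keys `post_id` and `vote_code` and the function name `remove_duplicates` are hypothetical, and the order in which the real program applies the two checks is not documented here.

```python
def remove_duplicates(posts):
    """Split posts into (valid, fakes) using duplicate vote codes and post IDs.

    Assumption: the first use of a code or ID wins; later repeats are fakes.
    """
    seen_codes = set()
    seen_ids = set()
    valid, fakes = [], []
    for post in posts:
        if post["vote_code"] in seen_codes:
            fakes.append((post, "duplicate vote code"))
        elif post["post_id"] in seen_ids:
            fakes.append((post, "duplicate post ID"))
        else:
            seen_codes.add(post["vote_code"])
            seen_ids.add(post["post_id"])
            valid.append(post)
    return valid, fakes
```

Keeping the reason next to each rejected post mirrors the idea of logging why a vote was removed from consideration.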

Line 6 of Match.txt

Reserved

  • Configme(5)

Line 7 of Match.txt

Reserved

  • Configme(6)

Line 8 of Match.txt

Match Time Control

  • Configme(7) - Match Time Control [was configme(8) in v5.98]
  • '-1' for only counting those in the appropriate time block, '0' for not. '-1' is recommended.
  • Major corrections in 6.00
  • Note, in waterfall these will show as 'fake', but this does not affect 'fake' vote totals in straight results.

Line 9 of Match.txt

  • Related to Line 8 (required regardless of line 8 value)
  • Start of Match, XX:XX:XX format, or 0. 0 = default 01:00:00

Line 10 of Match.txt

  • Related to Line 8 (required regardless of line 8 value)
  • End of Match, XX:XX:XX format, or 0. 0 = default 23:00:59
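A minimal sketch of the time-block test driven by Lines 9 and 10, assuming post times are already available as XX:XX:XX strings and that both ends of the window are inclusive (the program's exact boundary handling is not stated here). The names `to_seconds`, `window`, and `in_match_window` are made up for this example; the defaults are the ones listed above.

```python
DEFAULT_START = "01:00:00"   # used when Line 9 is 0
DEFAULT_END = "23:00:59"     # used when Line 10 is 0


def to_seconds(clock):
    """Convert an XX:XX:XX string to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def window(start_line, end_line):
    """Turn the raw Line 9 and Line 10 values into a (first, last) pair."""
    start = DEFAULT_START if start_line.strip() == "0" else start_line.strip()
    end = DEFAULT_END if end_line.strip() == "0" else end_line.strip()
    return to_seconds(start), to_seconds(end)


def in_match_window(post_time, start_line="0", end_line="0"):
    """True when the post time falls inside the match window (inclusive)."""
    first, last = window(start_line, end_line)
    return first <= to_seconds(post_time) <= last
```

With the example match.txt values, `in_match_window("23:05:00", "01:00:00", "23:11:59")` accepts a post that the default end time would reject.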

Line 11 of Match.txt

  • This is an example of a proper vote code. In 2007, there was a match where the vote code length changed, causing issues with the program, thus the introduction of this in the match file. It should include the double brackets.

Line 12 of Match.txt

  • Prefix code. This is to verify the code is one that is used for this match. Changes each day.
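How the program uses Lines 11 and 12 internally is not spelled out here, so the Python sketch below is only one plausible reading: a candidate code is accepted when it has the same length as the example code, is wrapped in double brackets, and starts with the day's prefix right after the opening brackets. The function name `looks_like_vote_code` and this exact set of checks are assumptions.

```python
def looks_like_vote_code(candidate, example, prefix):
    """Shape check for a vote code against the Line 11 example and Line 12 prefix.

    Assumed rules: same length as the example, double brackets, matching prefix.
    """
    return (
        len(candidate) == len(example)
        and candidate.startswith("[[" + prefix)
        and candidate.endswith("]]")
    )


example = "[[AS7-/Ria5eVO-HD]]"   # Line 11 from the example match.txt
prefix = "AS7-"                    # Line 12 from the example match.txt
print(looks_like_vote_code("[[AS7-/Ria5eVO-HD]]", example, prefix))
```

A check like this only confirms the shape of a code; it cannot tell whether the code was actually issued, which is why the results can never be 100% accurate.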

Analysis

Line 13 of Match.txt

2-Way Cross, Full

  • Configme(8) - 2way Cross Full
  • '-1' to compute 2way cross full, '0' to not. '-1' recommended for matches with 2 votes.
  • Beta 5.9 reworks the Beta 5.5 system (designed for 2 girls per match, 2 matches) to cover the whole 9 girls. Note, this is still 2-way picking, so if you sum a row and column, the result will not match the total votes for that girl (it should be higher). This is because if someone picked girl set 1-4-7, the program sees that as 1-4, 1-7, and 4-7: all 3 of the 2-way crosses. Thus, if summed, you'd have 2 1s, 2 4s, 2 7s. Numbers in brackets are votes cast ONLY for a given character, with no cross (i.e., they didn't vote in any other match). 2-way cross is still primarily intended for 2 matches at a time, not 3, but it can still provide interesting popularity information. This data should only be used for that type of analysis, e.g. "People who liked 1 also seemed to vote for 7."
  • Results are on-screen and SM2WAY.TXT
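The pair counting described above can be sketched in Python with `itertools.combinations`. A ballot is assumed to be a list of character numbers picked in one post; the function name `two_way_cross` and the returned `(pairs, solo)` layout are illustrative choices, not the program's actual data structures.

```python
from collections import Counter
from itertools import combinations


def two_way_cross(ballots):
    """Count every 2-way cross per ballot, plus votes cast for one character only."""
    pairs = Counter()
    solo = Counter()
    for picks in ballots:
        unique = sorted(set(picks))
        if len(unique) == 1:
            solo[unique[0]] += 1             # the bracketed "no cross" numbers
        for pair in combinations(unique, 2):
            pairs[pair] += 1                 # 1-4-7 yields 1-4, 1-7 and 4-7
    return pairs, solo


pairs, solo = two_way_cross([[1, 4, 7], [1, 4], [7]])
print(dict(pairs), dict(solo))
```

Because a three-pick ballot contributes each character to two pairs, summing a row and column overcounts that character, which is the effect noted in the description.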

Line 14 of Match.txt

2-Way Cross, Match (Not Implemented)

  • Configme(9) - RESERVED

Line 15 of Match.txt

3-Way Cross, Full (Not Implemented)

  • Configme(11) - RESERVED

This is currently under development. The difficulty is there are approximately 252 possible 3-way crosses (full), + 36 2-ways + 9 1-ways that total 297 data points. This is not available in Beta 5.9

Line 16 of Match.txt

3-Way Cross, Match

  • Configme(12) - 3-Way Match
  • '-1' to compute 3way cross match, '0' to not. '-1' recommended for matches with 3 votes.
  • Version 5.98. The system will allow 4 x 4 x 4 match runs (64 options). But note, it will throw away anything that is NOT a 3-way match, or where two from the same match are chosen.
  • Results are on screen and in SM3WAYM.TXT.
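The filtering rule above (keep only ballots with exactly one pick from each of the three matches) can be sketched like this in Python. The mapping `match_of`, from character number to match number, and the function name `three_way_match` are assumptions for the example.

```python
from collections import Counter


def three_way_match(ballots, match_of):
    """Count 3-way combinations, discarding ballots that are not one pick per match."""
    combos = Counter()
    for picks in ballots:
        matches = sorted(match_of[pick] for pick in picks)
        if matches != [1, 2, 3]:
            continue                          # not a 3-way match, or two from one match
        ordered = tuple(sorted(picks, key=lambda pick: match_of[pick]))
        combos[ordered] += 1
    return combos


match_of = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 3, 8: 3, 9: 3}
print(three_way_match([[1, 4, 7], [7, 4, 1], [1, 2, 7], [1, 4]], match_of))
```

Sorting each ballot by match number means 1-4-7 and 7-4-1 land in the same bucket.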

Line 17 of Match.txt

Post Numbers

  • Configme(13) - Post Validation Numbers
  • '-1' to compute validation file. '0' to not. '-1' is recommended for testers comparing against data mining.
  • For Beta 5.9 this is located in the Saimoe.log file. This is the same type of result the actual processing unit uses to show what votes counted. This is intended to serve as both a way to see if it counted your vote, as well as a chance to compare against the ACTUAL results once released, to look for votes that were missed or counted by the program that were or were not real.
  • Results on-screen and in SMVALI.TXT

Line 18 of Match.txt

  • Configme(14) Reserved for future use. Should be '0'.

Line 19 of Match.txt

  • Configme(15) Reserved for future use. Should be '0'.

Line 20 of Match.txt

  • Configme(16) Reserved for future use. Should be '0'.

File Configuration

Line 21 of Match.txt

  • Reserved. Should be 'MASTER.TXT'

Line 22 of Match.txt

  • Reserved. Should be 'MASTER.TXT'

Line 23 of Match.txt

  • Reserved. Should be 'MASTER.TXT'

Line 24 of Match.txt

  • Total Thread Source Files to open. Should be more than 0.

Line 24a of Match.txt

The actual files to open for the input source (former in.txt), one each line, as many as specified in line 24.

NOTE: MUST BE IN DOS FORMAT! That means the name can only be an 8 character file name, a dot, and a 3 character extension. If the file is in another directory, directory names must be the 8-char version.

So I highly recommend that you keep the input in the same directory as the program, and use file names like:

  • in.txt
  • in2.txt
  • in3.txt
  • 080902a.txt
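A quick way to check names against the 8.3 rule before putting them in match.txt is sketched below in Python. The pattern deliberately accepts only letters, digits, underscore, and hyphen, which is a conservative subset (DOS itself allows a few more symbols); the names `DOS_NAME` and `is_dos_name` are made up for this example, and it checks a bare file name, not a path.

```python
import re

# Conservative 8.3 pattern: up to 8 name characters, optional dot plus up to 3.
DOS_NAME = re.compile(r"[A-Za-z0-9_-]{1,8}(\.[A-Za-z0-9_-]{1,3})?")


def is_dos_name(name):
    """True when `name` fits the 8-character name, 3-character extension rule."""
    return DOS_NAME.fullmatch(name) is not None


for candidate in ["in.txt", "080902a.txt", "saimoe_round1.txt", "in.html"]:
    print(candidate, is_dos_name(candidate))
```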

Character Configuration

Line 25 of Match.txt

Number of characters to parse. This number must be between 1 and 9. While it is known that there are matches that had 10 total chara, this is rare, and 9 already stretches the limits of the DOS environment. If someone can provide me a full set of threads from a 10-chara match, I might be able to work on expanding to this.

Lines 25a of Match.txt

The next lines, as many as the number given in Line 25, set the character names and their matches. Format:

MATCH,NAME

One Character Name per line. These are the names of the characters in the match. Due to software limitations, ROMAJI (English) is recommended. Use of other formats has not been attempted. MATCH is either 1, 2, or 3. For the lines that follow this, character 1 is the first line, character 2 is the second line, etc.

All remaining lines of Match.txt

The following lines (up to 100 total) are match templates, in the format:

#,LOOKUP-VALUE

Where the number refers to the character it is to credit, and the item is the thing to be searched for. Note, 2ch uses Shift-JIS, and this program uses 8-bit ASCII, so in this file the lookup values will look somewhat like gibberish. I find it helps to verify the file by copying it, renaming the copy to .htm, and opening it with a web browser capable of displaying Shift-JIS. The most recent version of Thread-Count-Light can also aid in this process. The current Alpha experiments with auto-discovery, which should help here as well.
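To make the overall layout easier to follow, here is a Python sketch that splits a match.txt into the parts described above and then looks up templates in a post. It assumes the text was read with a one-byte-per-character encoding such as Latin-1, so the Shift-JIS lookup values stay intact as raw 8-bit values. The function names, the dictionary keys, and the simple substring test in `credited_characters` are assumptions; the program's real detection routine is not documented here.

```python
def parse_match_file(text):
    """Split match.txt content into its documented parts (a sketch)."""
    # Split on newlines only: str.splitlines() would also break on byte 0x85,
    # which can occur inside Shift-JIS lookup values read as Latin-1.
    lines = text.replace("\r\n", "\n").split("\n")
    pos = 0

    def take(count):
        nonlocal pos
        chunk = lines[pos:pos + count]
        if len(chunk) < count:
            raise ValueError("match file ended early near line %d" % (pos + 1))
        pos += count
        return chunk

    header = take(20)                    # Lines 1-20: switches, times, codes
    take(3)                              # Lines 21-23: reserved file names
    sources = take(int(take(1)[0]))      # Line 24 count, then Line 24a names
    chara_count = int(take(1)[0])        # Line 25: number of characters
    if not 1 <= chara_count <= 9:
        raise ValueError("Line 25 must be between 1 and 9")
    characters = []
    for raw in take(chara_count):        # Lines 25a: MATCH,NAME
        match, name = raw.split(",", 1)
        characters.append((int(match), name))
    templates = []
    for raw in lines[pos:]:              # remaining lines: #,LOOKUP-VALUE
        if raw:
            number, lookup = raw.split(",", 1)
            templates.append((int(number), lookup))
    return {
        "start": header[8],              # Line 9
        "end": header[9],                # Line 10
        "vote_code_example": header[10], # Line 11
        "prefix": header[11],            # Line 12
        "sources": sources,
        "characters": characters,
        "templates": templates,
    }


def credited_characters(post_text, templates):
    """Return the character numbers whose lookup value occurs in the post."""
    return sorted({number for number, lookup in templates if lookup in post_text})
```

Reading both match.txt and the thread source with `open(path, encoding="latin-1")` keeps the bytes on both sides comparable, which is all a plain substring search needs.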



Notes

It is important that the "EOF" is present at the end of a line, not on an empty line. If it is on an empty line, the program can crash. If the number in Line 25 does not match the characters actually listed, the program may crash or produce unusable results.
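A small pre-flight check along these lines can be sketched in Python. It reads the raw bytes of a match file and reports two things: a trailing empty line, and templates that credit a character number above the count declared in Line 25. The second check only catches a count that is too low, not every possible mismatch. The function name `match_file_problems` and the wording of the messages are assumptions.

```python
def match_file_problems(data):
    """Return a list of likely problems in raw match.txt bytes (a sketch)."""
    problems = []
    if data.endswith(b"\n") or data.endswith(b"\r"):
        problems.append("file ends with an empty line")
    lines = data.replace(b"\r\n", b"\n").split(b"\n")
    try:
        source_count = int(lines[23])        # Line 24: number of source files
        count_index = 24 + source_count      # Line 25 follows the source names
        chara_count = int(lines[count_index])
    except (IndexError, ValueError):
        return problems + ["could not read the file or character counts"]
    entries = [line for line in lines[count_index + 1:] if line]
    if len(entries) < chara_count:
        problems.append("fewer character lines than Line 25 declares")
    for line in entries[chara_count:]:       # everything after the names is a template
        number = line.split(b",", 1)[0]
        if not number.isdigit() or not 1 <= int(number) <= chara_count:
            problems.append("template credits an undeclared character: %r" % line)
    return problems
```

An empty result does not prove the file is correct; it only means neither of these two checks fired.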

Other Files Related To This Program

in.txt (or equiv)

  • Required to run.
  • File name is set in match.txt file, so this may be whatever is selected.
  • Provided by user from 2ch.
  • File may have other names in the version 5.94+ multi-file support mode (for improved ease).
  • To create it, view a COMPLETE 2ch thread, right click and 'view source', save the source, and rename it to in.txt (or similar). That is basically all this file is.
  • For version 5.93 and before, multiple threads must be copied into one file (end to end). Version 5.93 and before CAN detect the thread changes, and will produce a validation code similar to the actual validation file.
  • Version 5.94 and above supports multiple input files. NOTE, however, each file must include the FULL source code of its page, or the program might not detect thread changes, so make sure you're viewing '1-1000'.

SAIMOE.LOG

  • Generated by program on run. Automatically over-written each run.
  • Is mostly intended for debugging, but does include some useful information, mostly around the actual processing of the entries. It will let you know about all the fake votes it encountered, and why each was removed from consideration.

MASTER.TXT

  • Generated by program on run. User selectable on if it is appended or over-written.
  • It can be changed to .csv and opened with just about any spreadsheet program, like Excel.
  • TIME = Time of the POST
  • THD_POST = post code, similar to how the actual system works. Thread #, then 3 digit post number.
  • POST_ID = Poster's id according to 2ch.
  • POST_CODE = Poster's code, hopefully as issued by the admin of the tourney.
  • Following are in vote, total vote format. Note, total votes are based on thread position, so not as useful if you are going to graph. One of my future projects is to create a stats version that will make a very nice time versus votes unit to make this quicker to graph.

CODELIST.LOG

  • Mostly a debug thing. This shows all the post codes the system encountered. Is re-written each time the program is run.

IDLIST.LOG

  • Mostly a debug thing. This shows all the ID codes the system encountered. Is re-written each time the program is run.

SMRESL.TXT

  • Program Generated. New to 5.9
  • This is just the text results output.
  • Name is first
  • Total votes is next
  • Numbers in parentheses are the votes by each detection id
  • Numbers in brackets are fake votes tossed for that chara.

SM2WAY.TXT

  • Program Generated. New to 5.9
  • This is the results from the 2-way calculation. See Analysis Section.

SMVALI.TXT

  • Program Generated. New to 5.91
  • This is the output from the code validation routine (still in log file though).

SM3WAYM.TXT

  • Program Generated. (New to 9.56)
  • This is the 3-way match crossing, across matches only. First set is by match, second set is listed from highest to lowest.

Future Files

The 3-way crossings (both variants, hopefully) will be logged



Change Log

6.00 Beta (from 5.98 Beta)

  • Append and Header Options have been removed. These are no longer needed thanks to multiple input files. References to configme(3) and configme(4) have been removed from the program, so can be applied to other uses.
  • No Vote Code has been removed. Obviously it really doesn't serve a reasonable function. Saimoe rules dictate you have to have a code to vote, so it's silly to even bother.
  • While there was a place to turn checking time off, it couldn't actually be turned off. That is corrected.
  • Those who voted before start or after end were counted as 'fakes' ... while they were not penalized (they could vote again and it would accept), it increased fake vote tallies unnecessarily. Note, water-fall will still show them as 'fake' ... and future graphing subroutine may do the same.
  • Last line of SAIMOE.LOG now contains total run time.
  • Quieted Down 3-way stat match... seems to work acceptably for me... much less debug info now saved to saimoe.log
  • Now can turn on and off whatever analysis parts you fancy (except straight results, as I see no point in turning that off).
  • oh yeah.. and match.txt is totally messed up from match60.txt, ne?

For Previous Versions

In the Works

  • Add MASTER2.TXT that includes fakes
  • Sort Master2.txt for graphing purposes.
  • Change the Results to work with Match data, now that such data is available
  • Rework detection routine to allow for upgrade below. (This will be, hopefully, unnoticeable to the user)
  • Do "best guess" matching, to ease work on front-end coding by user
  • Incorporate TCL into this program to aid in making the match60.txt file
  • do 2way that sorts out highest to lowest.. just to be nice. (do with 2way match)
  • match60.txt is going to have to change.... either combining the validation part or something, so that I can do adaptive and do selective results.

Bugs

  • Drag and Drop usage forces "?:\documents and settings\USER\" to be the default start location... so MUST use shortcut to get this to operate correctly.
  • If a user votes missing one of the matches, then votes again, without changing other two votes, then addition is accepted? ... make sure this is true of actual program, then see if I can find a crazy way to implement into my system. (have to implement matchf.txt to really do this).
  • MEMORY WARNING: This program is confined by DOS memory allocations. Program may crash unexpectedly if running with memory utilization. If this happens, change memory setting to more conservative (or just file).

What you Really Want

IMPORTANT - RIGHT CLICK, SAVE-AS ... otherwise the .txt will just load in this window. Save to any directory (preferably one just for this program).

CURRENT

  • [match.txt] - EXAMPLE - UPDATED FOR 6.00
  • [in.txt] - EXAMPLE - UPDATED FOR 6.00


HISTORICAL