btw this Wiki is not meant for a wider audience yet

File:Eam charfreqanalysis single-letters.png

From HFGCS Wiki
Revision as of 03:04, 2 December 2025 by Breadpan

Original file (1,492 × 788 pixels, file size: 60 KB, MIME type: image/png)

Summary

An analysis of how often each alphabetic character appears in EAMs. It was generated by running eam_characterfreqanalysis_.py, which is freely available on NEET INTEL's GitHub, against an EAM dataset that was unreleased at the time of running.

CODE
%Run eam_characterfreqanalysis_.py

Successfully loaded 12247 rows from SET-ALL
Found messages with forbidden digits (0,1,8,9) on dates: 2022.01.29, 2025.06.10, 2025.06.11, 2025.11.18
Filtered out 13 messages containing digits 0, 1, 8, or 9
Filtered out 528 rows with backslashes in Q column
Filtered out 302 rows with backticks in SOURCE column
Filtered out 42 rows with underscores, periods, or question marks in MESSAGE column
Removed 4947 duplicate messages
Final dataset: 6415 unique messages
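The log above implies a chain of filters followed by de-duplication. The following is only a minimal sketch of that kind of pipeline, not the code of eam_characterfreqanalysis_.py. The column names Q, SOURCE and MESSAGE come from the log; the row layout (a list of dicts), the function name filter_rows, and the order in which the filters run are assumptions made for illustration.

```python
import re

# Characters taken from the log messages above.
FORBIDDEN_DIGITS = re.compile(r"[0189]")
BAD_MESSAGE_CHARS = re.compile(r"[_.?]")


def filter_rows(rows):
    """Apply log-style filters in order, then drop duplicate messages.

    `rows` is assumed to be a list of dicts with "Q", "SOURCE" and
    "MESSAGE" keys. This layout is hypothetical.
    Returns (unique_rows, removed_counts_per_step).
    """
    steps = [
        ("forbidden digits", lambda r: FORBIDDEN_DIGITS.search(r["MESSAGE"])),
        ("backslash in Q", lambda r: "\\" in r["Q"]),
        ("backtick in SOURCE", lambda r: "`" in r["SOURCE"]),
        ("bad chars in MESSAGE", lambda r: BAD_MESSAGE_CHARS.search(r["MESSAGE"])),
    ]
    removed = {}
    for name, is_bad in steps:
        kept = [r for r in rows if not is_bad(r)]
        removed[name] = len(rows) - len(kept)
        rows = kept

    # Keep the first occurrence of each message text.
    seen = set()
    unique = []
    for r in rows:
        if r["MESSAGE"] not in seen:
            seen.add(r["MESSAGE"])
            unique.append(r)
    removed["duplicates"] = len(rows) - len(unique)
    return unique, removed
```

The per-step counts in the log are consistent with such a sequential pipeline: 12247 - 13 - 528 - 302 - 42 - 4947 = 6415.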

What would you like to analyze?
Enter a NUMBER (1, 2, 3, etc.) to analyze digit sequences of that length
Enter any LETTER to analyze alphabet frequencies (a-z)
Enter any 2-CHARACTER combination (AB, 23, X7, etc.) to analyze all 2-char combinations
Enter ASTERISKS (**,***,****,etc.) to analyze characters appearing that many times consecutively
Enter 'X_' (where X is any letter/number) to see what characters appear AFTER X
Enter '_X' (where X is any letter/number) to see what characters appear BEFORE X
Add '+' to INCLUDE the first 2 characters (e.g., '2+' or 'A+')
Add '-' to EXCLUDE strings with consecutive identical characters (e.g., '2-')
You can combine modifiers (e.g., '2+-')
Without modifiers, the first 2 characters are ignored and consecutive chars are included
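The menu above amounts to a small input grammar: a core choice followed by optional '+' and '-' modifiers. One way such input could be parsed is sketched below. The function name parse_choice, the mode labels, and the returned tuple are hypothetical, and the script's real parsing may differ. In particular, the menu does not say how an all-digit entry such as '23' is disambiguated; this sketch assumes any two-character alphanumeric entry is a 2-character combination.

```python
def parse_choice(raw):
    """Parse one menu entry into (mode, argument, options).

    Hypothetical helper: mode names and the return shape are
    illustrative, not taken from the actual script.
    """
    text = raw.strip().upper()
    core = text.rstrip("+-")            # drop trailing modifiers, if any
    modifiers = text[len(core):]
    options = {
        "include_first_two": "+" in modifiers,   # '+' modifier
        "exclude_repeats": "-" in modifiers,     # '-' modifier
    }
    if not core or not core.isascii():
        raise ValueError(f"unrecognised choice: {raw!r}")

    if set(core) == {"*"}:
        return "runs", len(core), options
    if len(core) == 2 and core[1] == "_" and core[0].isalnum():
        return "after", core[0], options
    if len(core) == 2 and core[0] == "_" and core[1].isalnum():
        return "before", core[1], options
    if len(core) == 2 and core.isalnum():
        # Assumption: two alphanumerics always mean "all 2-char combinations".
        return "pairs", None, options
    if core.isdigit():
        return "digit_sequences", int(core), options
    if len(core) == 1 and core.isalpha():
        return "letters", None, options
    raise ValueError(f"unrecognised choice: {raw!r}")
```

With this sketch, the entry 'a' used in the run above maps to the letter-frequency mode with both options off, which matches the logged "Ignoring first 2 characters in analysis".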

Enter your choice: a
Ignoring first 2 characters in analysis

Analyzing letter frequencies:
Letters that appear: 26/26
Letters that don't appear: 0/26
Total letter occurrences: 174774

Letter frequencies:
A: 6658 occurrences (3.8%)
B: 6867 occurrences (3.9%)
C: 6915 occurrences (4.0%)
D: 6761 occurrences (3.9%)
E: 6909 occurrences (4.0%)
F: 6944 occurrences (4.0%)
G: 6754 occurrences (3.9%)
H: 6890 occurrences (3.9%)
I: 6924 occurrences (4.0%)
J: 6943 occurrences (4.0%)
K: 6785 occurrences (3.9%)
L: 6856 occurrences (3.9%)
M: 4572 occurrences (2.6%)
N: 6778 occurrences (3.9%)
O: 6659 occurrences (3.8%)
P: 6693 occurrences (3.8%)
Q: 6705 occurrences (3.8%)
R: 6728 occurrences (3.8%)
S: 6749 occurrences (3.9%)
T: 6908 occurrences (4.0%)
U: 6802 occurrences (3.9%)
V: 6644 occurrences (3.8%)
W: 6838 occurrences (3.9%)
X: 6910 occurrences (4.0%)
Y: 6770 occurrences (3.9%)
Z: 6812 occurrences (3.9%)
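A tally like the one above can be produced with a plain counter. The sketch below is an illustration under stated assumptions, not the script's own code: the function name letter_frequencies is hypothetical, the sample messages are made up, and upper-casing the text before counting is an assumption. The default of skipping the first 2 characters mirrors the logged behaviour without modifiers.

```python
from collections import Counter
import string


def letter_frequencies(messages, skip=2):
    """Count A-Z occurrences across messages, ignoring the first `skip` chars.

    Hypothetical helper; returns (Counter of letters, total letter count).
    """
    counts = Counter()
    for msg in messages:
        body = msg[skip:].upper()
        counts.update(ch for ch in body if ch in string.ascii_uppercase)
    return counts, sum(counts.values())


# Made-up sample input, only to show the output shape.
sample = ["ABCDEF", "ZZAAB2"]
counts, total = letter_frequencies(sample)
for letter, n in sorted(counts.items()):
    print(f"{letter}: {n} occurrences ({100 * n / total:.1f}%)")
```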

Top 5 most frequent letters:
F: 6944 (4.0%)
J: 6943 (4.0%)
I: 6924 (4.0%)
C: 6915 (4.0%)
X: 6910 (4.0%)

Top 5 least frequent letters (that appear):
P: 6693 (3.8%)
O: 6659 (3.8%)
A: 6658 (3.8%)
V: 6644 (3.8%)
M: 4572 (2.6%)
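The logged figures can be cross-checked with a few lines of arithmetic. The snippet below copies the counts from the output above, confirms that they sum to the reported total, recomputes the percentages, and compares M with the share each letter would have if all 26 were equally frequent. The equal-share baseline is only a reference point chosen for this illustration; it makes no claim about why M is lower.

```python
# Counts copied from the "Letter frequencies" output above.
counts = {
    "A": 6658, "B": 6867, "C": 6915, "D": 6761, "E": 6909, "F": 6944,
    "G": 6754, "H": 6890, "I": 6924, "J": 6943, "K": 6785, "L": 6856,
    "M": 4572, "N": 6778, "O": 6659, "P": 6693, "Q": 6705, "R": 6728,
    "S": 6749, "T": 6908, "U": 6802, "V": 6644, "W": 6838, "X": 6910,
    "Y": 6770, "Z": 6812,
}

total = sum(counts.values())  # the log reports 174774

# Percentage share per letter, rounded as in the log.
shares = {letter: round(100 * n / total, 1) for letter, n in counts.items()}

# Reference point: the count each letter would have under equal shares.
equal_share = total / len(counts)
m_ratio = counts["M"] / equal_share

print(f"total = {total}")
print(f"equal share per letter = {equal_share:.1f}")
print(f"M relative to equal share = {m_ratio:.2f}")
```

The counts sum to 174774, matching the reported total, and M comes out at roughly 68% of an equal share, while every other letter rounds to between 3.8% and 4.0%.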

File history


Date/Time: 02:36, 2 December 2025 (current)
Dimensions: 1,492 × 788 (60 KB)
User: Breadpan
Comment: An analysis of how often alphabetic characters appear in EAMs. This was generated by eam_characterfreqanalysis_.py, which is freely available at https://github.com/NEETINTEL/EAM_SOFTWARE_SUITE, and an (at time of running) unreleased EAM dataset. ...
