Appendix - Methods: Unscripted: Even LLMs Think Trump’s Speeches are Strange
Before beginning the analysis, Trump’s speeches at the UN in New York City and at Quantico, Virginia, were sourced on the internet. Due to some formatting inconsistencies—like timestamps and line breaks—the speeches were edited and streamlined into a simplified “.txt” file format that Gemini could reliably work from during the building and testing of the script. After that, the interesting work began.
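The exact cleanup steps are not specified in the text, but the kind of preprocessing described (stripping timestamps and stray line breaks before saving a plain ".txt" file) can be sketched with a few regular expressions. The timestamp pattern below is an assumption for illustration; real transcripts may format timestamps differently.

```python
import re

def clean_transcript(raw_text: str) -> str:
    """Strip timestamp markers and collapse stray line breaks from a raw transcript."""
    # Remove bracketed timestamps such as [00:12:34] (assumed format).
    text = re.sub(r"\[\d{1,2}:\d{2}(?::\d{2})?\]", "", raw_text)
    # Join single line breaks inside a paragraph into spaces,
    # preserving blank lines between paragraphs.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Normalize runs of spaces left behind by the removals.
    text = re.sub(r" {2,}", " ", text)
    return text.strip()

raw = "[00:01:15] Thank you very much.\nIt's an honor\nto be here.\n\n[00:01:42] We have done a great job."
print(clean_transcript(raw))
```

The cleaned string can then be written to a ".txt" file with `open(path, "w").write(...)` for the analysis script to read.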
I asked Gemini to build a script that would first analyze the sentiment of Trump’s speeches at the UN and Quantico separately, then compare them in a single graphic. Gemini built the analysis script using the Natural Language Toolkit (NLTK) library and its embedded Valence Aware Dictionary and sEntiment Reasoner (VADER) analyzer. The NLTK library provides packages that help tokenize text and remove stopwords, essentially breaking Trump’s language down into individual words while eliminating filler terms like “and,” “a,” and “the.” The VADER analyzer is a bit more complicated to explain, but it uses a defined lexicon (a list of words, each associated with a sentiment score) and context heuristics (which interpret punctuation, capitalization, modifiers, negations, and other parts of speech) to score text along a positive-neutral-negative continuum. It also produces a normalized, weighted composite score that captures the overall sentiment of the text, ranging from most negative (-1) to most positive (+1), giving a quick read on the overall tone of the speech itself.
Each prompt given to Gemini was checked, and edits were made whenever Gemini’s reasoning produced errors (after each prompt, Gemini generated a list of tasks to be completed, which was cross-checked against the original prompt). After each stage of the analysis, Gemini was asked to explain any errors, if applicable, and to run automatic debugging where possible. Errors in naming conventions or data sourcing were corrected manually. At the end of the analysis, Gemini was asked to explain the model’s entire methodology, accounting for all errors, Python packages used, and visualization techniques. When explanations were unnecessarily long or nebulous, Gemini was prompted to go deeper and point to the specific places in the code where it had implemented changes or improvements based on past prompts. Though individual prompts were not tracked over time, the chat history was continuously reviewed to make sure Gemini’s active working memory reflected recent requests and modifications, ensuring a streamlined analysis from start to finish. All code was saved and pushed to Google Drive cloud storage for future use and iteration.
