As professional software developers, we are well aware that vulnerabilities in web applications can sometimes be puzzling, leaving us scratching our heads and wondering, “How is this even possible?” One such category of vulnerabilities that often falls into this enigmatic realm is known as web race conditions. In this technical blog post, we will delve deep into the world of web race conditions, exploring their true potential and introducing a novel technique to detect and exploit them effectively.
Race Conditions in Web Applications
A race condition occurs when multiple concurrent processes or threads access shared resources simultaneously, leading to unpredictable and often unintended behavior. In web applications, race conditions manifest when specific sequences of user actions or HTTP requests lead to unexpected outcomes. These vulnerabilities are not always apparent and can be challenging to identify and reproduce.
Classic Examples of Web Race Conditions
Before we delve into the technical details, let’s look at some classic examples of web race conditions:
Reuse of Single-Use Vouchers: Imagine a scenario where a user attempts to redeem a single-use voucher multiple times within a short time window due to a race condition.
Bypassing Rate Limits: A race condition might allow a user to bypass rate limits imposed on certain actions, leading to potential abuse.
Reviewing a Product Multiple Times: In some cases, users might exploit a race condition to submit multiple reviews for a single product.
Recaptcha Exploitation: One intriguing example involves reusing a valid reCAPTCHA solution multiple times within a short time frame, as reported to Google.
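All of the examples above share the same shape: a check-then-act gap. As a minimal illustration (the class and method names here are hypothetical, and the two "requests" are interleaved by hand to make the race window deterministic), the following Python sketch shows how two concurrent redemptions can both pass the used-voucher check before either marks the voucher as used:

```python
class VoucherService:
    """Toy in-memory voucher store (hypothetical, for illustration only)."""

    def __init__(self):
        self.used = False

    def check(self):
        # Step 1 of the handler: is the voucher still valid?
        return not self.used

    def redeem(self):
        # Step 2 of the handler: mark it used and apply the discount.
        self.used = True
        return "discount applied"


svc = VoucherService()

# Two concurrent requests both run the check *before* either runs redeem.
request_a_ok = svc.check()   # True
request_b_ok = svc.check()   # still True: the voucher is not yet marked used

results = []
if request_a_ok:
    results.append(svc.redeem())
if request_b_ok:
    results.append(svc.redeem())

print(len(results))  # the single-use voucher was applied twice
```

In a real application the two checks come from concurrent HTTP handlers rather than manual interleaving, but the sequence of operations, and the fix (making check-and-redeem a single atomic step), is the same.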
Beyond Limit Overruns
While the examples above can be categorized as limit overruns, web race conditions are not limited to these scenarios. They can manifest in unexpected ways, leading to complex and subtle vulnerabilities that are not immediately apparent.
Race Conditions on the Web: The Untapped Potential
Josip Franjković’s blog post, “Race Conditions on the Web,” sheds light on the world of web race conditions. In this post, he describes a vulnerability that took two months to replicate, where Facebook would generate two confirmation codes for two different email addresses using different parameter names in a single email. This was not a typical limit overrun; it was something more complex.
The realization that race condition attacks can be far more intricate and impactful than initially thought prompted a deeper exploration of this attack class. In this blog post, we aim to uncover the true potential of web race conditions, providing you with tools, techniques, case studies, and a live demonstration of their capabilities.
Eliminating Network Jitter
One of the primary challenges in detecting race conditions is dealing with network jitter, which introduces random delays in requests, making it difficult to align the race windows. To overcome this obstacle, we introduce the “single-packet attack.”
How the Single-Packet Attack Works
The single-packet attack is built on the principles of TCP and HTTP. It involves sending 20 to 30 HTTP requests within a single packet, ensuring that they arrive at the server simultaneously, regardless of network jitter. This innovative technique eliminates the variability introduced by network delays and significantly enhances the efficiency of race condition discovery.
Technical Implementation of the Single-Packet Attack
Implementing the single-packet attack is surprisingly straightforward. By creatively leveraging Nagle’s algorithm, which is present in all operating system network stacks, you can avoid the need to code a custom TCP or TLS stack. Instead, you can extend an existing HTTP/2 library to incorporate this feature, making it accessible and adaptable.
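The real attack withholds the final HTTP/2 frame of each request and flushes them all in one TCP packet, which requires extending an HTTP/2 library as described above. As a rough, stdlib-only sketch of the underlying idea (the host and path below are placeholders), the snippet concatenates a batch of pipelined HTTP/1.1 requests into a single buffer small enough to fit one typical TCP segment, so a single send() lets the kernel deliver the whole batch at once:

```python
def build_single_segment_batch(path, host, count=20, mss=1460):
    """Concatenate `count` pipelined HTTP/1.1 requests into one buffer.

    If the buffer fits within one maximum segment size, a single send()
    can emit the whole batch in one TCP packet, removing per-request
    network jitter from the race window.
    """
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    ).encode()
    payload = request * count
    if len(payload) > mss:
        raise ValueError("batch too large for a single segment")
    return payload


payload = build_single_segment_batch("/redeem?code=ABC123", "example.com")
print(len(payload), payload.count(b"GET "))

# One sendall() would then dispatch all requests together:
#   sock = socket.create_connection(("example.com", 80))
#   sock.sendall(payload)
```

This is only an analogue: HTTP/1.1 pipelining is processed serially by most servers, which is exactly why the production technique uses HTTP/2, where all streams in the packet are processed concurrently.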
Measuring the Impact
To assess the effectiveness of the single-packet attack, we conducted performance benchmarking. We compared it with the previous best-known technique, last-byte synchronization. The results were compelling:
In practical terms, the single-packet attack makes remote race conditions behave like local ones by eliminating network jitter, delivering a four- to tenfold performance improvement over previous methods.
A Three-Step Methodology
Detecting and exploiting web race conditions requires a systematic approach. We propose a three-step methodology:
Prediction: Identify potential collision points in your application where race conditions might occur. Focus on areas with severe consequences.
Probe for Clues: Benchmark normal behavior and look for anomalies when sending requests concurrently. Deviations from the baseline may indicate the presence of a race condition.
Proof of Concept: Understand the behavior, replicate it, and explore the impact. Be prepared to investigate deeply and uncover hidden vulnerabilities.
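The probing step can be mechanized with a small harness. The sketch below is a generic skeleton (the `send_request` callable is an assumption; plug in your own HTTP client): it fires the same request sequentially to establish a baseline, then concurrently behind a barrier, so you can diff the two response distributions for anomalies.

```python
import threading
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def probe(send_request, attempts=10):
    """Return (sequential, concurrent) Counters of responses from send_request."""
    sequential = Counter(send_request() for _ in range(attempts))

    barrier = threading.Barrier(attempts)

    def synced(_):
        barrier.wait()              # release every worker at the same instant
        return send_request()

    with ThreadPoolExecutor(max_workers=attempts) as pool:
        concurrent_counts = Counter(pool.map(synced, range(attempts)))

    return sequential, concurrent_counts


# Demo with a deterministic stub; in a real probe, send_request would issue
# the HTTP request you identified during the prediction step.
base_counts, race_counts = probe(lambda: 200)
print(base_counts, race_counts)
```

Any divergence between the two counters (duplicate side effects, unexpected status codes, extra emails) is the clue that sends you on to step three.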
Practical Application of the Methodology
To illustrate the methodology in action, let’s examine a case study involving GitLab’s email verification process. We initially discovered a race condition that allowed multiple invitations with the same email address, resulting in privilege escalation.
Expanding the Scope
While investigating the GitLab case, we didn’t stop at the first exploit we found. We explored the application further, seeking out security-relevant features that relied on the same structural weakness. This approach led to more significant discoveries and higher-impact vulnerabilities.
Web race conditions are a fascinating and often overlooked class of vulnerabilities that can have a substantial impact on web applications. By understanding their true potential, adopting innovative techniques like the single-packet attack, and following a systematic methodology for detection and exploitation, software developers can enhance the security of their applications.
In a constantly evolving threat landscape, staying ahead of the curve and exploring the intricacies of web race conditions can make all the difference in ensuring the resilience of your web applications. Happy hunting for those elusive vulnerabilities!
The Concept behind the “Nagger” Technique: The “Nagger” technique is a personalized timer that acts as a gentle reminder to stay focused on a specific task. Inspired by the concept of overcoming ADHD traits, this technique aids in minimizing distractions and refocusing during unpleasant or mundane tasks. By incorporating the “Nagger” into your workflow, you can reduce the likelihood of becoming sidetracked and increase your overall efficiency.
Understanding the Shell Function: To implement the “Nagger” technique, we’ll utilize a shell function called nagme(). This function accepts two parameters: the duration in minutes and the task description. Let’s take a closer look at the code:
nagme () {
    # Require exactly two arguments: the delay in minutes and the task text.
    # (Inside a function, $0 is the shell's name, so the usage message
    # spells out the function name instead.)
    [ "$#" -ne 2 ] && printf 'usage: nagme [in_minutes] [text]\n' && return 1
    printf 'sleeping %s min before telling you to %s\n' "$1" "$2"
    # bc allows fractional minutes, e.g. `nagme 0.5 "stretch"`.
    sleep "$(echo "$1 * 60" | bc)"
    espeak "$2" > /dev/null 2>&1
    # Keep nagging every 30 seconds until interrupted (Ctrl-C).
    while :
    do
        sleep 30
        echo -n '.'
        espeak "I'm nagging you to $2" > /dev/null 2>&1
    done
}
The nagme() function starts by checking if the correct number of arguments is provided. If not, it displays the correct usage and returns an error. Next, it informs you about the duration and task you’re about to undertake. The function then sleeps for the specified duration, after which it uses the espeak command to audibly notify you of the task at hand. The function enters a loop where it repeatedly echoes a dot and nags you verbally every 30 seconds until you stop it manually.
How to Incorporate the “Nagger” Technique: Implementing the “Nagger” technique is straightforward. Follow these steps to integrate it into your workflow effectively:
Step 1: Set Up the Environment: Ensure you have the necessary dependencies installed. In this case, we require the espeak command-line tool for speech synthesis. You can install it using your package manager.
Step 2: Define the nagme() Function: Copy the nagme() function code provided above and add it to your shell environment, such as your .bashrc or .zshrc file. Alternatively, you can define the function in a separate script and source it when needed.
Step 3: Utilize the “Nagger” Technique: To leverage the “Nagger” technique, call the nagme() function with the desired duration and task description as arguments. For example:
$ nagme 30 "Empty the washing machine"
This command initiates a 30-minute “Nagger” session that reminds you to empty the washing machine every 30 seconds until you manually stop it.
Step 4: Reap the Benefits: While working on your tasks, resist the temptation to turn off the “Nagger” prematurely. Allow the technique to work its magic by keeping you engaged and refocused until the task is complete. Over time, you’ll notice an improvement in your ability to maintain attention and avoid over-engineering during hyperfocus periods.
Conclusion: By incorporating the “Nagger” technique into your software development workflow, you can combat distractions, enhance focus, and improve your overall productivity. This personalized timer acts as a gentle reminder, ensuring you stay on track and complete tasks efficiently. Give it a try and experience the positive impact it can have on your concentration and output. Happy coding!
Imagine a world where resources are effortlessly identified using intelligent naming conventions. In the vast realm of protocols and systems, such identification has often been a source of chaos. But fear not, for a groundbreaking solution has arrived: the Named Information (NI) identifier scheme. Let’s dive into this fascinating realm and explore how it enhances resource naming with the magic of hash functions.
The NI URI Format: Cracking the Code Think of NI URIs as the secret code that unlocks the true identity of resources. By combining various components, including the scheme name, authority, digest algorithm, and digest value, NI URIs provide a standardized and structured approach to naming resources. For example, an NI URI might look like this: ni:///sha-256;UyaQV-Ev4rdLoHyJJWCi11OHfrYv9E1aGQAlMO2X_-Q. It’s like a digital fingerprint that uniquely identifies a resource in the vast digital universe.
Mapping with .well-known URI: The Gateway to Accessibility Imagine a scenario where some clients are unaware of the NI scheme. How can they access these named resources? Fear not, for the NI scheme has a secret gateway—the .well-known URI. By cleverly mapping NI URIs to HTTP(S) URLs using the .well-known namespace, clients without NI support can seamlessly access named resources. It’s like having a master key that opens doors to a whole new world of resources.
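Both forms are easy to construct. As a short Python sketch (the example.com authority is a placeholder), the snippet below builds an ni URI from a SHA-256 digest, base64url-encoded with the padding stripped as the scheme requires, along with the equivalent .well-known HTTP URL:

```python
import base64
import hashlib


def ni_uri(data, authority=""):
    """Build an ni URI: base64url(SHA-256(data)) with '=' padding stripped."""
    digest = hashlib.sha256(data).digest()
    b64 = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return f"ni://{authority}/sha-256;{b64}"


def well_known_url(data, host):
    """Map the same name into the .well-known HTTP namespace."""
    digest = hashlib.sha256(data).digest()
    b64 = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return f"https://{host}/.well-known/ni/sha-256/{b64}"


name = ni_uri(b"Hello World!")
url = well_known_url(b"Hello World!", "example.com")
print(name)
print(url)
```

The 32-byte SHA-256 digest always yields 43 unpadded base64url characters, which is why every sha-256 ni URI has the same length regardless of the resource it names.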
Binary Format: Compact and Powerful In a world where space is precious, a more compact representation is often desired. That’s where the binary format comes into play. It condenses the NI name into a header and hash value, making it more space-efficient. This binary format supports various hash algorithms and truncation lengths, ensuring flexibility while saving valuable resources. Imagine a 128-bit identifier efficiently representing a resource in a protocol that craves efficiency and speed.
Human-Speakable NIH URI Format: The Language of Resources Now, let’s explore the human side of resource naming. Sometimes, we need to speak the names of resources, whether over a phone call or in a voice command. That’s where the Human-Speakable (NIH) URI format shines. It provides an intuitive and easily pronounceable representation. For example, an NIH URI might look like this: nih:sha-256;uyaq-v-ev4rd-lohy-jjwci-11ohf-ryv9e-1agqa-lmo2x-q;5. It’s like a secret language that bridges the gap between humans and the digital realm.
Summary: With the Named Information (NI) identifier scheme, resource naming has evolved into a seamless and powerful process. The NI URI format brings structure and standardization, ensuring resources are accurately identified. The mapping with the .well-known URI extends accessibility to all clients, even those unaware of the NI scheme. The binary format optimizes space efficiency, while the human-speakable NIH URI format enables easy communication with resources. Together, these advancements revolutionize the way we name and interact with resources, simplifying our digital lives while enhancing security and usability. So, embrace the power of NI and unlock a world where resources reveal their true identities at a glance.
Although this approach holds an allure of simplicity, it comes with significant drawbacks: it’s often computationally expensive and lacks modularity, making adjustments and fine-tuning a challenge.
Instead of the LLM maximalist approach, I propose an alternative: LLM pragmatism. The idea is to break down the task into separate, manageable pieces and treat the LLM as just another module in the system.
To illustrate this concept, let’s walk through a Python-coded example. Suppose you are building a social media reputation management tool that summarizes mentions of your company on platforms like Twitter or Reddit.
# Step 1: Extract mentions
def extract_mentions(posts, company):
    mentions = [post for post in posts if company in post]
    return mentions
company_mentions = extract_mentions(posts, 'your_company_name')
The above function represents the first step in our pipeline: extracting relevant mentions of your company from a list of posts. It’s a simple, deterministic function that doesn’t require an LLM.
However, for the next step, determining sentiment of each mention, we might initially want to use an LLM.
# Step 2: Determine sentiment using an LLM
def get_sentiment(texts, model):
    sentiments = [model.predict(text) for text in texts]
    return sentiments
model = SomeSentimentModel() # this should be a pre-trained LLM, fine-tuned for sentiment analysis
sentiments = get_sentiment(company_mentions, model)
The third step is summarizing. This is also something that could be done deterministically.
# Step 3: Summarize
def summarize_mentions(mentions, sentiments):
    pos = [mention for mention, sentiment in zip(mentions, sentiments) if sentiment == 'positive']
    neg = [mention for mention, sentiment in zip(mentions, sentiments) if sentiment == 'negative']
    neutral = [mention for mention, sentiment in zip(mentions, sentiments) if sentiment == 'neutral']
    summary = f"Positive mentions: {len(pos)}, Negative mentions: {len(neg)}, Neutral mentions: {len(neutral)}"
    return summary
summary = summarize_mentions(company_mentions, sentiments)
print(summary)
To further optimize this pipeline, we could train a custom model for sentiment analysis. This requires generating training data, which can be assisted by the LLM.
# Generate training data
def generate_training_data(mentions, model):
    training_data = [(mention, model.predict(mention)) for mention in mentions]
    return training_data
training_data = generate_training_data(company_mentions, model)
# Correct the generated training data manually if needed
for i in range(len(training_data)):
    print(f"Mention: {training_data[i][0]}, Sentiment: {training_data[i][1]}")
    correct_sentiment = input("Enter the correct sentiment: ")
    training_data[i] = (training_data[i][0], correct_sentiment)
Now, you can use this training data to train a more focused sentiment analysis model.
By following the LLM pragmatism approach, we’ve created an NLP pipeline that is modular, scalable, and more cost-effective, while still focusing on a robust and reliable final system.
Using LLMs in this way can help build better systems, breaking down knowledge barriers, aiding in data creation, and improving workflows. We should aim to use LLMs during development, but rely on them as little as possible during runtime, allowing us to train cheaper, faster, and more reliable replacements. This is what I believe is the best use of LLMs in modern Natural Language Processing projects.
The sequence ~? prints a help message listing all of the supported escape sequences. For example, the Terminal client in macOS shows the options noted below.
Supported escape sequences:
~. - terminate connection (and any multiplexed sessions)
~B - send a BREAK to the remote system
~C - open a command line
~R - request rekey
~V/v - decrease/increase verbosity (LogLevel)
~^Z - suspend ssh
~# - list forwarded connections
~& - background ssh (when waiting for connections to terminate)
~? - this message
~~ - send the escape character by typing it twice
(Note that escapes are only recognized immediately after newline.)
In other words, an escape sequence only takes effect when typed immediately after a newline: press Enter first, then the tilde.
Additionally, if you used SSH to get into host A and then a second SSH session to get from host A into host B, escapes you type are interpreted by the outer client first. To break the A-B connection, type ~~. (a doubled tilde followed by a dot): the doubled tilde is forwarded to the inner client, which then interprets the dot as "terminate connection".
This is a proposal for a JS language feature that optimizes for the fewest possible characters, as though fewer characters meant “more readable”, or more understandable.
The proposal is an entire language feature and syntax to support a single construct
a(b(c(d())))
Which is frankly not anything remotely common enough to warrant a custom syntax.
Their argument against this is that it’s hard to read if the nesting is too deep, and that the temporaries are unclear, especially if they’re reused. This is an absurd argument. Why would you reuse a single temporary, unless, again, you believe in optimizing for the fewest characters written? Heh, as if that were the core metric for developer productivity.
The proposal attempts to justify itself by claiming it applies to other cases as well, such as
a(b(c(), d()))
It claims that reusing the % operator as an unnamed placeholder token is more readable:
c() |> b(%, d()) |> a(%)
Which is more characters and, I’d say, much less readable, but it is also half-assed: with two calls feeding b, which function’s result should get to be %?
The placeholder also has orthogonality problems
a |> (c |> f(%)) |> d
The obvious response is “don’t do that, it’s unreadable”, but that fails to acknowledge that the only reason this situation can occur is that the new syntax is insufficiently thought through. Fundamentally, the proposal’s owners only care about their a(b(c(d()))) case, and try to paper over this with the pretense of covering other cases.
David Bohm’s System of Thought: David Bohm, although trained as a theoretical physicist, made significant contributions to philosophy and the exploration of consciousness. In “Thought as a System,” he proposes that thoughts do not occur in isolation but are part of a larger, interconnected system.
The Interconnectedness of Thought and Code: Bohm observes, “Every little movement of thought is part of the overall ‘flow’”. This insight is remarkably analogous to the process of software development. Just as thoughts are part of an interconnected system, so too are the lines of code that make up a software program. Each function, each variable, is part of a larger whole, demonstrating the intrinsic interconnectedness of code.
Fragmentation in Thought and Code: Bohm also argues that this system of thought can lead to a fragmented worldview, causing a myriad of societal issues. Analogously, fragmentation in software can result in bugs, crashes, and inefficient code. The parallels between Bohm’s concept of a thought system and the systemic nature of software development are uncanny.
Addressing Criticism: Pirsig’s Counterpoint: While Bohm’s ideas provide a thought-provoking perspective, they have also received their share of criticism. Some critics, including Robert Pirsig, the author of “Zen and the Art of Motorcycle Maintenance,” argue that Bohm’s theory oversimplifies complex realities. Pirsig, a champion of examining the quality of ideas, suggests that Bohm’s framework falls short of capturing the nuanced realities of thought and existence.
Pirsig contends that systems cannot be entirely comprehended or predicted through their individual components alone. Similarly, in software development, one could argue that understanding the intricacies of a program requires more than just a systemic perspective. It’s a reminder of the complexity and unpredictability inherent in both thought processes and coding.
A Broader Perspective for Software Development: David Bohm’s “Thought as a System” presents an interesting lens through which to view our role as software developers. While Bohm’s ideas should not be applied uncritically or too literally, recognizing the systemic nature of our work can offer a broader perspective on problem-solving and the implications of our creations.
In our rapidly evolving technological landscape, such an interconnected perspective could catalyze significant advancements. It’s a challenging proposition, but as Bohm aptly observes, “A change of meaning is necessary to change this world politically, economically and in every way.”
Rating: 4 out of 5
Ed25519 keys are short. Very short. If you’re used to copying multiple lines of characters from system to system, you’ll be pleasantly surprised by the size: the public key is only about 68 characters. Authentication is also much faster than with comparably secure RSA (3072+ bits).
With ssh-keygen, the -o option requests the new OpenSSH private key format, which protects the key with a modern bcrypt-based key derivation function (for Ed25519 keys this format is already the default, so -o can be omitted). Use the -a option to set the number of KDF rounds; a higher count slows down brute-force attacks against the passphrase.
ssh-keygen -a 100 -t ed25519 -f ~/.ssh/key-file -C "user@domain.tld"
Note the line ‘Your identification has been saved in /home/user/.ssh/key-file’, confirming the path you passed with -f.
Then add the newly generated key to your SSH agent.
ssh-add ~/.ssh/key-file
The following is an example of using Duff’s device to unroll a loop that copies a block of memory.
void duff_memcpy(char *dest, const char *src, size_t count) {
    if (count == 0)
        return; /* guard: the do-while below would otherwise copy 8 bytes */
    size_t n = (count + 7) / 8;  /* number of passes through the loop body */
    switch (count % 8) {         /* jump into the loop to handle the remainder */
    case 0: do { *dest++ = *src++;
    case 7:      *dest++ = *src++;
    case 6:      *dest++ = *src++;
    case 5:      *dest++ = *src++;
    case 4:      *dest++ = *src++;
    case 3:      *dest++ = *src++;
    case 2:      *dest++ = *src++;
    case 1:      *dest++ = *src++;
            } while (--n > 0);
    }
}
In this example, the loop is unrolled by a factor of 8. The switch statement takes care of the remainder when the count of bytes to be copied is not divisible by 8. The result is that the loop runs faster, as it reduces the number of times the loop control needs to be executed. However, this technique should be used with caution as it can make the code more complex and harder to read and maintain.
An example of how you might use the duff_memcpy() function in a program is shown below.
#include <stdio.h>
#include <string.h>
int main() {
    char src[] = "Hello, World!";
    char dest[sizeof src];
    duff_memcpy(dest, src, sizeof src);
    printf("%s\n", dest);
    return 0;
}
A “real-world” use case is when we have a large data set that needs to be copied, and performance is critical. By using Duff’s device the developer can improve the performance of the data copy operation.
Please note that the standard library already provides a highly optimized memcpy (with memcpy_s available as a bounds-checked variant on MSVC), and C++ additionally offers std::copy. Modern compilers also unroll and vectorize simple copy loops on their own, so using Duff’s device is not recommended outside educational purposes or the rare case where you have measured a real performance win on a critical copy path.