Dimensional Algebra
I don't think it's 2026.
Anyway, 534.
1 month ago | [YT] | 4
Dimensional Algebra
Why do we place our trust in mathematics? Are we merely relying on the collective agreement of human minds? My point is that human cognition is fallible, and even a majority consensus can be wrong. Therefore, the fact that many people agree on a mathematical idea does not, by itself, guarantee that it is true.
2 months ago | [YT] | 18
Dimensional Algebra
Quick update on that Bernoulli video:
Several of you caught it right away. I showed one equation at the start but then solved a completely different one.
Fixing it properly and will upload the corrected version in 1-2 days.
Most of the video is actually fine. It's just the example equation that got scrambled. But math needs to be right, so I'm redoing those parts.
Thanks for catching the error. This is embarrassing but also exactly why having an engaged audience matters.
New version coming soon!
https://youtu.be/lJE9oSqrpOU
2 months ago (edited) | [YT] | 9
Dimensional Algebra
DeepSeek just dropped an IMO gold-medalist model.
On ProofBench-Advanced—where models prove formal mathematical theorems—GPT-5 scores 20%. Gemini Deep Think IMO Gold hits 65.7%. DeepSeek Math V2 (Heavy) scores 61.9%.
That's second place—but Gemini isn't open source.
This is the best open math model in the world. And DeepSeek released the weights. Apache 2.0.
Here's what they discovered:
1/ Why Normal LLMs Break on Real Math
Most large language models are great at sounding smart, but:
- They’re rewarded for the final answer, not the reasoning.
- If they accidentally land on the right number with bad logic, they still get full credit.
- Over time they become “confident liars”: fluent, persuasive, and sometimes wrong.
That’s fatal for real math, where the proof is the product.
To fix this, DeepSeek Math V2 changes what the model gets rewarded for: not just being right, but being rigorously right.
2/ The Core Idea: Generator + Verifier
Instead of one model doing everything, DeepSeek splits the job:
1. Generator – the “mathematician”
- Produces a full, step-by-step proof.
2. Verifier – the “internal auditor”
- Checks the proof for logical soundness.
- Ignores the final answer. It only cares about the reasoning.
This creates an internal feedback loop:
One model proposes, the other critiques.
3/ The Secret Sauce: 1.0/0.5/0.0
The verifier doesn't just say yes or no. It scores on three levels:
1.0 = Rigorous, watertight
0.5 = Right idea, sloppy execution
0.0 = Fatal flaws
That 0.5 is the breakthrough.
It's the referee saying: "You solved it, but this wouldn't pass peer review."
When the generator sees 0.5, it re-reads its own proof, finds the weak steps, tightens the argument.
The model learns to debug its reasoning, not just guess better.
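Here's a rough sketch of that contract in code. The names and types below are made up for illustration; it's the shape of the idea, not DeepSeek's actual training code:

```python
# Rough sketch of the generator/verifier split and the three-level grade.
# All names here are hypothetical stand-ins, not DeepSeek's real code.
from dataclasses import dataclass

@dataclass
class Verdict:
    grade: float     # 1.0 rigorous, 0.5 right idea but gaps, 0.0 fatally flawed
    critique: str    # which steps are weak and why

def generate_proof(problem: str, feedback: str | None = None) -> str:
    """Generator ("the mathematician"): writes a full step-by-step proof,
    optionally revising it in light of an earlier critique."""
    raise NotImplementedError  # stand-in for a call to the generator model

def verify_proof(problem: str, proof: str) -> Verdict:
    """Verifier ("the internal auditor"): grades the reasoning, not the answer."""
    raise NotImplementedError  # stand-in for a call to the verifier model

def training_reward(problem: str, proof: str) -> float:
    # Reward tracks proof quality: a sloppy 0.5 proof earns less than a
    # watertight 1.0 one, so the generator learns to tighten its reasoning.
    return verify_proof(problem, proof).grade
```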
4/ Putnam, IMO, and ProofBench
- Putnam 2024 – ~118/120
- IMO-Gold level performance
- On a “basic” proof dataset, V2 solves almost every problem
- On an “advanced” dataset with long, tricky proofs, it still performs strongly, while many other large models collapse in accuracy
Models without this internal verifier do okay on short, easy proofs…
…and then fall off a cliff on long, complex ones.
DeepSeek’s architecture shows that built-in self-checking is the difference between “good at math questions” and “actually good at proofs.”
5/ Keeping the Verifier Ahead
The big risk: if the generator gets smarter while the verifier stays weak, the generator learns to game it.
Three-phase solution:
Phase 1 – Human Cold Start. Contest problems graded by expert mathematicians. Anchors the verifier to real standards.
Phase 2 – Meta-Verification. The verifier can start hallucinating errors—seeing problems that don't exist. Solution: a second model checks whether critiques are legitimate or noise.
Phase 3 – Scaled Compute. For the hardest problems, human labeling is too slow. Run many verification passes, use majority vote as training signal.
Humans set the rules. Compute scales them.
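Phase 3 in sketch form: run several independent verifier passes on the same proof and use the majority grade as the label. verify() below is a hypothetical stand-in for a verifier call:

```python
# Sketch of the scaled-compute signal: majority vote over verifier passes.
from collections import Counter

def verify(problem: str, proof: str) -> float:
    """One verifier pass: returns 1.0, 0.5, or 0.0 (stand-in for a model call,
    sampled with some randomness so passes can disagree)."""
    raise NotImplementedError

def consensus_grade(problem: str, proof: str, n_passes: int = 8) -> float:
    grades = [verify(problem, proof) for _ in range(n_passes)]
    # A one-off hallucinated error gets outvoted by the passes that don't see it.
    return Counter(grades).most_common(1)[0][0]
```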
6/ Big Model, Big Hardware
DeepSeek Math V2 is a Mixture-of-Experts (MoE) model with about 685B parameters.
- Only some “experts” are active per token, so each step is cheaper than a dense 685B model
- But all those parameters still have to live in GPU memory
The code is open. The bottleneck is compute.
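For a rough sense of scale, here's the back-of-the-envelope math for the weights alone (ignoring KV cache and activation memory):

```python
# ~685B parameters; memory for the weights depends on the precision you load.
PARAMS = 685e9
for label, bytes_per_param in [("FP16/BF16", 2), ("FP8", 1), ("4-bit", 0.5)]:
    print(f"{label}: ~{PARAMS * bytes_per_param / 1e9:,.0f} GB just for weights")
# FP16/BF16 ≈ 1,370 GB, FP8 ≈ 685 GB, 4-bit ≈ 342 GB
# -> a multi-GPU server either way, before any inference overhead.
```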
7/ How You Actually Use It: Agent Mode
In practice, you don’t just send one prompt and get a perfect proof.
Instead, you run it in agent mode, something like:
1. Ask it to solve a problem.
2. It generates a proof and a self-verification score.
3. If the score is 0.5, you feed its own critique back in:
- “Refine this proof based on the issues you identified.”
4. Repeat this refinement loop a few times (e.g., up to 8 rounds).
5. Stop when it produces a 1.0 proof or you’re satisfied.
You're managing a feedback loop, not passively waiting for output.
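A minimal version of that loop, assuming you serve the released weights behind an OpenAI-compatible endpoint (e.g. via vLLM or SGLang). The base URL, served-model name, and the crude check for a self-grade of 1.0 are placeholders, not official client code:

```python
# Minimal agent-mode refinement loop (sketch; endpoint and parsing are assumptions).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
MODEL = "deepseek-ai/DeepSeek-Math-V2"  # hypothetical served-model name

def ask(messages: list[dict]) -> str:
    resp = client.chat.completions.create(model=MODEL, messages=messages)
    return resp.choices[0].message.content

def prove(problem: str, max_rounds: int = 8) -> str:
    messages = [{"role": "user", "content":
                 f"Prove the following and grade your own proof (1.0 / 0.5 / 0.0):\n{problem}"}]
    proof = ask(messages)
    for _ in range(max_rounds):
        if "1.0" in proof[-200:]:  # crude: stop once the self-grade looks watertight
            break
        messages += [{"role": "assistant", "content": proof},
                     {"role": "user", "content":
                      "Refine this proof based on the issues you identified."}]
        proof = ask(messages)
    return proof
```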
8/ Limitations
Creativity. Great at formal reasoning and polishing proofs. Still struggles with problems needing genuinely novel insight.
Cost. Those record-setting scores rely on many proof attempts and verification runs. Real-world use means cheaper settings, slightly lower performance.
Residual Errors. The verifier is still a neural net. It can be fooled. Error rate is lower, not zero.
This is a big leap toward reliable reasoning—not "perfect AI mathematician."
9/ From Chatbots to Reasoners
DeepSeek Math V2 represents more than just a math milestone.
The pattern here will spread:
- Split generation and verification
- Train on proof quality, not just right answers
- Add self-critique loops and meta-verifiers
This is the template for any domain where being wrong is expensive—code, science, law, anything that needs to survive peer review.
Paper: github.com/deepseek-ai/DeepSeek-Math-V2/blob/main/…
Model: huggingface.co/deepseek-ai/DeepSeek-Math-V2
2 months ago | [YT] | 7
Dimensional Algebra
ai.studio/apps/drive/1YmRxyMuRRWzBMoIP91fB74EHpNcC…
2 months ago | [YT] | 8
Dimensional Algebra
Why
4 months ago | [YT] | 17
Dimensional Algebra
"The order of a finite field is a prime power, but the characteristic must be a prime."
6 months ago | [YT] | 8
Dimensional Algebra
Most people say that my math videos contain contradictions, flaws, and inconsistencies. Some even suggested that I should stop making them or delete the channel altogether. But the real question is: why are they so afraid of contradictions? This has been my unique way of learning, and I call it "Learning by Contradiction." Let me explain what that means.
There are two main ways people learn something. The first is the traditional way. When someone teaches you something, you simply accept it, remember it, and use it the way you were taught. The second way is the method I followed—learning through contradiction. I didn’t just accept information as it was. Instead, I started by assuming something, tested it in different situations, and refined my understanding by identifying and resolving contradictions.
For example, suppose a teacher tells you that 2+2 = 4. If you ask why, they will say it’s because of the rules of addition, and then they will explain those rules. You will remember the explanation and use it as knowledge. But my approach was different. I assumed that 2+2 = 4 because of a certain reason, let’s call it reason X. Then I applied that reason to another problem, like 2+3. If I found inconsistency in the result, I knew that reason X was flawed. So I created a new reason, reason Y, which worked better. I kept doing this until I arrived at a consistent explanation for why 2+2 = 4. This process gave me a deeper and more personal understanding of the concept.
I didn’t start by reading books or asking teachers. I started by assuming something, and then I explored it until I reached a better understanding. For me, to truly understand "something", you must first understand "nothing". When you grasp what "nothing" is, only then can you begin to understand "something".
Especially in math, if you never face contradictions in your calculations, it means you're not learning anything new. You’re just using information, but you don’t have real knowledge. Knowledge comes from facing errors, identifying contradictions, and overcoming them. That’s how I grew.
Through learning by contradiction, I developed my own way of understanding different mathematical concepts. To learn about complex numbers, I created virtual numbers. To explore 2×2 matrices, I introduced quadrix numbers. To understand quaternions, I invented mokkabions. To explore why the derivative of e^x is e^x, I designed a new function called the S function. And to grasp why (-1)! is undefined, I constructed a system called singularity numbers.
People may criticize or reject this approach, but this is how I grew. This is how I learned. This is how I built real knowledge—one contradiction at a time.
#LearnByContradiction
#DimensionAlgebra
6 months ago | [YT] | 19
Dimensional Algebra
✅ VIRTUAL UNIT AND THE LAMBERT W FUNCTION
The Lambert W function, W(x), is defined as the inverse of y * e^y: if
x = y * e^y
then
W(x) = y
Now, in the virtual number system, we have the relation:
j = π * e^(j/2)
Rewriting it:
π = j * e^(-j/2)
Multiplying both sides by -1:
-π = -j * e^(-j/2)
Dividing both sides by 2:
-π/2 = (-j/2) * e^(-j/2)
This matches the standard form x = y * e^y, where:
x = -π/2
and
y = -j/2
So, applying the Lambert W function:
W(-π/2) = -j/2
Therefore:
j = -2 * W(-π/2)
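One way to sanity-check this numerically: model j as the complex number satisfying the same relation (which identifies j with -iπ; whether that matches the virtual-number interpretation is an assumption) and evaluate with SciPy:

```python
# Numerical check, treating j as the complex number with j = pi * e^(j/2).
import numpy as np
from scipy.special import lambertw

j = -2 * lambertw(-np.pi / 2)    # principal branch; W(-pi/2) = i*pi/2
print(j)                         # ~ -3.14159j, i.e. -i*pi
print(np.pi * np.exp(j / 2))     # ~ -3.14159j, equal to j as claimed
```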
6 months ago | [YT] | 15
Dimensional Algebra
π in the virtual number system.
#pi
6 months ago | [YT] | 33