I Fixed 25 Duplicate Transactions by Hand, Then Built the Tool to Never Need That Again
I use my own app for my own money, which is the only honest way to build one. A few weeks ago I sat down to reconcile a month and found my ledger was heavier than my bank said it should be. Twenty-five rows did not belong. Some were photo receipts dated to the wrong year, some were credit card installments stacked on one day, and the embarrassing ones were plain duplicates that a feature I had already shipped was supposed to catch. It turned out that feature never actually ran. Here is the whole story, and what it taught me about why finance apps double your spending.
I am writing this in my own voice, not Capi's, because it is a confession before it is a lesson. It is easy to put a tidy dashboard in front of people and never look hard at your own numbers. The day I did look hard, I caught my product lying to me by accident, and the fix was both simpler and more humbling than I expected. If you have ever wondered whether your budget app is quietly inflating your totals, this is how to find out, and how the problem actually gets solved.
Why did I have 25 duplicate transactions in my own app?
Three faults stacked up at once. A vision model read a batch of photo receipts and put the wrong year on some of them, so they sat in the wrong month. A credit card PDF collapsed every installment onto the purchase date, so a single split bill read as repeated charges. And worst, my own dedup step tagged duplicates in the import preview but never saved that tag, so the final commit re-read the untagged data and inserted everything anyway. Counted together, twenty-five rows.
None of these were exotic. Each is the kind of small, boring fault that ships when a feature is tested in pieces and never end to end on real data. The vision year error showed up only because I photograph receipts in the wild and some were creased. The installment collapse showed up only because I import an Inter PDF that lists parcelas. And the dedup ghost showed up only because I uploaded a statement that overlapped one I had sent before, which is exactly the situation dedup exists for. My own messy month was the test case I had never written.
What causes duplicate transactions in finance apps?
The usual causes are a pending charge and its posted version both kept, a reconnect that re-imports a window already loaded, the same account added twice, a manual entry colliding with an auto-import, and parsing faults like installments collapsing to one date. Each one quietly inflates your totals, and most apps surface no warning when it happens. You only notice when a category looks too high or the month does not match the statement.
I went deep on the cross-app version of this in why budget apps keep duplicating transactions, but the short version is that duplicates are an input problem, not a user problem. Banks send the same charge twice in two states. Aggregators re-pull overlapping windows after a connection drops. Migrations between apps, like the wave of people who left Mint when it shut down on March 23, 2024, carry doubles across with them. The transactions are real once. The plumbing is what turns one into two.
What was the bug in my own dedup feature?
It was the worst kind of bug, the sort that looks like it works. The import preview correctly detected the duplicate rows and labeled them to be skipped, so on screen the feature appeared to function. But the label lived only in memory during the preview and was never written back to the staged import. When I confirmed, the commit step re-read the original, untagged data and inserted every row, skips included. The dedup count sat at zero forever, which I had read as "no duplicates found" when it actually meant "nothing was ever skipped."
That gap between looks-right and is-right is the whole danger of finance software. A dashboard that shows a clean number feels authoritative even when the number is wrong, a point I made at length in why finance apps lie to you about your spending. My dedup was theatre: the preview reassured me while the commit ignored it. The fix was three lines that persist the tagged result so the commit reads the same data the preview showed. The lesson cost me an afternoon of manual cleanup and a fair amount of pride.
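To make the failure concrete, here is a minimal sketch of the pattern, not the production code. The names `StagedRow`, `StagedImport`, `preview`, and `commit` are invented for illustration, and a dict stands in for whatever storage holds the staged import. The only difference between the broken and fixed runs is whether the preview writes its tags back.

```python
from dataclasses import dataclass


@dataclass
class StagedRow:
    description: str
    amount_cents: int
    skip: bool = False  # True when the row duplicates something already in the ledger


class StagedImport:
    """Hypothetical staging store; a dict stands in for a database table."""

    def __init__(self, rows):
        self._saved = {
            i: StagedRow(r.description, r.amount_cents) for i, r in enumerate(rows)
        }

    def load(self):
        # Every load returns fresh copies, the way a database read would.
        return {
            i: StagedRow(r.description, r.amount_cents, r.skip)
            for i, r in self._saved.items()
        }

    def save(self, rows):
        self._saved = {
            i: StagedRow(r.description, r.amount_cents, r.skip)
            for i, r in rows.items()
        }


def preview(store, ledger_keys, persist):
    """Tag rows that already exist in the ledger; optionally save the tags."""
    rows = store.load()
    for row in rows.values():
        row.skip = (row.description, row.amount_cents) in ledger_keys
    if persist:
        store.save(rows)  # the fix: write the tags back to the staged import
    return rows


def commit(store):
    # Commit re-reads the staged data, so it only sees tags that were saved.
    return [r for r in store.load().values() if not r.skip]


ledger_keys = {("Padaria", 1850)}
incoming = [StagedRow("Padaria", 1850), StagedRow("Farmacia", 4200)]

buggy = StagedImport(incoming)
buggy_preview = preview(buggy, ledger_keys, persist=False)
buggy_inserted = commit(buggy)   # both rows go in, duplicate included

fixed = StagedImport(incoming)
preview(fixed, ledger_keys, persist=True)
fixed_inserted = commit(fixed)   # only the new row goes in
```

In the broken run the preview still shows one row tagged as a skip, which is exactly why the feature looked healthy on screen.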
How do you fix duplicate transactions by hand?
You match each duplicate by amount, date, and merchant, then delete one copy and keep the other; never delete both. Keep the version that carries the correct category and any note you added, and remove the bare one. When the duplicate came from a bad import rather than a one-off, fix the source so the next import does not repeat it. After every batch, reconcile the corrected total against the bank statement before you trust the number again.
Doing it by hand for twenty-five rows was tedious but clarifying. Sorting by amount instead of date made the pairs jump out, because two identical values land next to each other while a date view scatters them. The mis-dated photo rows I simply moved to the right month. The installment stack I deleted and re-imported once the parser was fixed to spread parcelas across the months they actually fall in. By the end I trusted the ledger again, and I had a precise list of everything the software should have prevented.
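The installment fix can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual parser: it assumes the first parcela falls on the purchase date and each later one lands a calendar month after the previous, and it puts any leftover cents on the first installment. Real card statements bill on closing dates, so a real parser would need the issuer's rules. The function names are hypothetical.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def spread_installments(purchase_date: date, total_cents: int, count: int):
    """Return one (date, amount_cents, label) row per installment."""
    base, remainder = divmod(total_cents, count)
    rows = []
    for i in range(count):
        # Assumption: leftover cents go on the first installment.
        amount = base + (remainder if i == 0 else 0)
        rows.append((add_months(purchase_date, i), amount, f"{i + 1}/{count}"))
    return rows


# A 100.00 purchase on 30 Nov 2025 split in three lands in Nov, Dec, and Jan.
rows = spread_installments(date(2025, 11, 30), 10000, 3)
```

Working in integer cents keeps the installments summing exactly to the purchase total, which matters once you reconcile against the statement.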
How can you tell if your budget app is duplicating transactions?
Compare the app total to your bank statement for the same month. If the app reads higher, sort by amount instead of date to surface identical pairs, check whether a pending and posted version of the same charge both survived, and look for installments that all share the purchase date. Duplicates push the total up; missing rows pull it down. The direction of the gap tells you which problem you have before you hunt for a single row.
If you want a repeatable check, run through these five steps on any tracker before you rely on its numbers.
- Put the app total and the real statement total for one month side by side. A higher app total points to duplicates.
- Sort by exact amount, not by date, so identical values sit together.
- Look for a pending charge and its posted twin that both survived a sync.
- Scan for installments that all share the purchase date instead of spreading across months.
- Filter for any transaction dated outside the current period, the signature of a photo or scan that got the year wrong.
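If you can export your transactions, the first, second, and last checks above are easy to script. The sketch below assumes each row is a dict with hypothetical `date`, `merchant`, and `amount_cents` keys; adapt the field names to whatever your tracker exports.

```python
from collections import defaultdict
from datetime import date


def gap_direction(app_total_cents, statement_total_cents):
    """A positive gap hints at duplicates, a negative one at missing rows."""
    gap = app_total_cents - statement_total_cents
    if gap > 0:
        return "possible duplicates"
    if gap < 0:
        return "possible missing rows"
    return "totals match"


def identical_amounts(rows):
    """Group rows by exact amount and keep only groups with more than one row."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["amount_cents"]].append(row)
    return {amount: group for amount, group in groups.items() if len(group) > 1}


def outside_period(rows, start, end):
    """Rows dated outside [start, end], e.g. a receipt read with the wrong year."""
    return [row for row in rows if not (start <= row["date"] <= end)]


rows = [
    {"date": date(2025, 3, 4), "merchant": "Mercado", "amount_cents": 12990},
    {"date": date(2025, 3, 4), "merchant": "Mercado", "amount_cents": 12990},
    {"date": date(2024, 3, 9), "merchant": "Cafe", "amount_cents": 1500},
]

direction = gap_direction(app_total_cents=27480, statement_total_cents=14490)
suspects = identical_amounts(rows)
misdated = outside_period(rows, date(2025, 3, 1), date(2025, 3, 31))
```

Matching amounts only flag candidates. Two coffees at the same price are not a duplicate, so confirm date and merchant before deleting anything.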
Which finance apps duplicate transactions, and how do they dedupe?
Every app I have used produces duplicates under the right conditions, and what separates them is whether they catch it and how. Bank-sync apps lean on the aggregator to link pending and posted charges, which mostly works until a reconnection re-pulls an old window. File and chat tools depend on matching rules at import. Here is the honest layout of where doubles come from in each and how each one tries to stop them.
| Tool | Common duplicate source | How it dedupes | Re-import safe | Price (2026) |
|---|---|---|---|---|
| Capi | Overlapping statement uploads | Source row hash, skips on commit | Yes, by row hash | Free 30/mo; $9.90/mo or $69.90/yr |
| YNAB | Manual entry plus import | Matches on import, you approve | Mostly, on approval | $14.99/mo or $109/yr |
| Monarch Money | Reconnect re-pull | Sync matching, manual merge | Varies by setup | $99.99/yr; Plus $199/yr |
| Copilot Money | Pending and posted both kept | Auto-match, you confirm | Sync managed | $13/mo or $95/yr |
| Mint to Credit Karma | Migration carryover | Limited, often manual | No, doubles carried | Mint closed 2024 |
YNAB is the cleanest contrast to my own approach, and it is a good design. It shows you matched imports and makes you approve them, so the human is the last check. I lay out where Capi and YNAB differ in full at Capi vs YNAB. Copilot leans hardest on automatic pending-to-posted matching, which is smooth when the connection is healthy and brittle when it is not, and I compare that model at Capi vs Copilot Money. None of these are wrong. They just put the safety check in different places.
The short version. I found 25 bad rows in my own ledger: photo receipts with the wrong year, installments stacked on one date, and plain duplicates my dedup step had failed to skip because it tagged them in the preview but never saved the tag. I fixed them by hand, then fixed the code so the skip runs at commit, the parser spreads installments across months, and out-of-range dates from photos get flagged. The test that caught it was using the thing myself.
How does Capi prevent duplicate transactions now?
Capi fingerprints the source row of every transaction with a hash and compares new rows against what is already in your ledger, so a re-uploaded statement or an overlapping export imports only the genuinely new rows. After the audit I fixed the step that tagged duplicates but failed to persist the tag, so the skip now runs at commit time, not only in the preview. The installment parser spreads parcelas across the months they fall in, and photo dates outside a sane range get flagged rather than saved.
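As a rough illustration of row fingerprinting, and not Capi's actual implementation, the sketch below hashes a normalized copy of each source row with SHA-256 and skips any row whose hash is already known at commit time. The normalization rule (collapse whitespace, lowercase) and the function names are assumptions made for the example.

```python
import hashlib


def row_fingerprint(raw_row: str) -> str:
    """Hash a normalized copy of the source row.

    Assumption: whitespace and letter case are noise, everything else is signal.
    """
    normalized = " ".join(raw_row.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def commit_new_rows(raw_rows, ledger_fingerprints):
    """Insert only rows with an unseen fingerprint; return (inserted, skipped)."""
    inserted, skipped = [], 0
    for raw in raw_rows:
        fingerprint = row_fingerprint(raw)
        if fingerprint in ledger_fingerprints:
            skipped += 1  # the skip happens here, at commit, not only in a preview
            continue
        ledger_fingerprints.add(fingerprint)
        inserted.append(raw)
    return inserted, skipped


ledger = set()
march_a = ["2025-03-04;Mercado;-129.90", "2025-03-05;Farmacia;-42.00"]
march_b = ["2025-03-05;Farmacia;-42.00", "2025-03-06;Padaria;-18.50"]  # overlaps march_a

first_inserted, first_skipped = commit_new_rows(march_a, ledger)
second_inserted, second_skipped = commit_new_rows(march_b, ledger)
```

One limitation of a sketch this simple: two genuinely separate purchases that produce byte-identical source rows would collapse into one, so a real implementation needs something extra, such as an occurrence counter within the file, to tell them apart.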
Where Capi will still need you, stated plainly. The row hash protects file and statement imports well, but a duplicate you type by hand that matches nothing in an existing file is on you to catch, the same as anywhere. Photo and voice capture is fast but worth a glance before you confirm, since a vision model can still misread a faded receipt. And no tool removes the need to reconcile against the bank once in a while. The honest promise is specific: re-uploading the same statement will not double your spending, and the dedup you see in the preview is the dedup that actually happens. If you want the broader picture of how Capi handles imports, the statement to budget walkthrough covers it.
What did dogfooding teach me about building finance tools?
That you cannot trust a feature you have only watched work in a demo. The dedup looked correct every time I tested the preview, because the preview was the part that worked. Only by running my own real, overlapping, photographed, installment-laden month through the whole pipeline did the gap appear. Eating your own dog food is not a slogan here, it is the only test that exercises the messy path users actually take, and it is the test I had skipped.
The deeper point is that finance software earns trust by being checkable, not by looking polished. I would rather tell you my dedup was theatre for a while and is fixed now than show you a flawless screenshot. The whole reason I track every currency and every receipt myself, which I wrote about in the seven-currency thirty-day test, is that the bugs only surface on real data. Twenty-five rows cost me an afternoon. They also made the product honestly better, which is the trade I will take every time.
Check your own month for duplicates.
Send a statement to Capi, and re-uploading an overlapping period will not double your spending, because every row is fingerprinted and the skip runs at commit.
Capi Free covers 30 transactions a month. Capi Core is $9.90 a month or $69.90 a year.
Frequently asked questions about duplicate transactions
Why did I have 25 duplicate transactions in my own app?
Three faults stacked up. A vision model read a batch of photo receipts and put the wrong year on some of them, a credit card PDF collapsed every installment onto the purchase date so they looked repeated, and worst, my own dedup step tagged duplicates in the preview but never saved that tag, so the commit re-read the untagged data and inserted everything. The count came to 25 rows.
What causes duplicate transactions in finance apps?
The usual causes are a pending charge and its posted version both kept, a reconnect that re-imports a window already loaded, the same account added twice, a manual entry colliding with an auto-import, and parsing faults like installments collapsing to one date. Each one quietly inflates your totals, and most apps surface no warning when it happens.
How can you tell if your budget app is duplicating transactions?
Compare the app total to your bank statement for the same month. If the app reads higher, sort by amount instead of date to surface identical pairs, check whether a pending and posted version of the same charge both survived, and look for installments that all share the purchase date. Duplicates inflate the total, missing rows lower it.
How does Capi prevent duplicate transactions?
Capi fingerprints the source row of every transaction with a hash and compares new rows against what is already in your ledger, so a re-uploaded statement or an overlapping export imports only the genuinely new rows. After the audit I also fixed the step that tagged duplicates but failed to persist the tag, so the skip now actually runs at commit time instead of only in the preview.
Do all finance apps duplicate transactions?
Most do at some point, because the inputs are messy: pending and posted charges, reconnections, overlapping statements, and migrations between apps all create doubles. The difference is whether the app catches them. Some match on import, some leave it to you, and some, as I learned about my own, ship a dedup step that does not run. No tool is immune by default.
How do you remove duplicate transactions safely?
Find the genuine duplicate by matching amount, date, and merchant, then delete one copy, not both. Keep the version that carries the correct category and any notes. If the duplicate came from a bad import, fix the source so the next import does not repeat it. Always reconcile the corrected total against the bank statement before trusting it.