[{"data":1,"prerenderedAt":277},["ShallowReactive",2],{"blog-en-how-food-photo-recognition-works":3},{"_path":4,"_dir":5,"_draft":6,"_partial":6,"_locale":7,"title":8,"description":9,"slug":10,"date":11,"cover":12,"category":13,"tags":14,"translationSlug":19,"body":20,"_type":271,"_id":272,"_source":273,"_file":274,"_stem":275,"_extension":276},"\u002Fen\u002Fblog\u002F2026-06-13-how-food-photo-recognition-works","blog",false,"","How we built food photo recognition — what's under the hood","A technical breakdown of NutriApp's photo recognition — which approaches we tried, what we landed on, where it still falls short, and where we're taking it next. No marketing gloss.","how-food-photo-recognition-works","2026-06-13","\u002Fblog\u002Fimages\u002Fhow-food-photo-recognition-works\u002Fcover.jpg","dev-log",[15,16,17,18,13],"photo recognition","AI","computer vision","product","kak-rabotaet-raspoznavanie-blyud-po-foto",{"type":21,"children":22,"toc":262},"root",[23,31,38,43,48,53,59,78,88,98,103,109,118,123,129,141,152,163,174,180,192,197,203,208,243,257],{"type":24,"tag":25,"props":26,"children":27},"element","p",{},[28],{"type":29,"value":30},"text","Today — about how photo recognition actually works in NutriApp. No marketing gloss, just what we tried, what landed, and what didn't.",{"type":24,"tag":32,"props":33,"children":35},"h2",{"id":34},"why-bother",[36],{"type":29,"value":37},"Why bother",{"type":24,"tag":25,"props":39,"children":40},{},[41],{"type":29,"value":42},"In classic trackers, logging a meal takes 1-3 minutes per dish. Open search, pick the right entry out of 47 lookalikes, set the grams, repeat for every ingredient. Over a day that's 5-10 minutes of \"extra\" busywork.",{"type":24,"tag":25,"props":44,"children":45},{},[46],{"type":29,"value":47},"By day three most people quit. 
Not from laziness: the input cost simply outgrows the value of tracking.",{"type":24,"tag":25,"props":49,"children":50},{},[51],{"type":29,"value":52},"The whole point of photo mode is to cut that cost down to ~20 seconds: photograph the plate, the app breaks it into ingredients with estimated grams, you tweak it and save.",{"type":24,"tag":32,"props":54,"children":56},{"id":55},"what-we-tried",[57],{"type":29,"value":58},"What we tried",{"type":24,"tag":25,"props":60,"children":61},{},[62,68,70,76],{"type":24,"tag":63,"props":64,"children":65},"strong",{},[66],{"type":29,"value":67},"Off-the-shelf image recognition services (Google Vision, Clarifai).",{"type":29,"value":69}," They tell you the picture contains \"pizza\" or \"salad bowl\" but don't break the dish into ingredients with grams. Useless for calorie counting: we need the ",{"type":24,"tag":71,"props":72,"children":73},"em",{},[74],{"type":29,"value":75},"composition",{"type":29,"value":77},", not the label.",{"type":24,"tag":25,"props":79,"children":80},{},[81,86],{"type":24,"tag":63,"props":82,"children":83},{},[84],{"type":29,"value":85},"Training our own CV model.",{"type":29,"value":87}," The idea: collect a dataset of regional dishes and train a model. We quickly realized you need hundreds of thousands of labeled plates for acceptable accuracy, and labeling costs millions. Unrealistic for our team size.",{"type":24,"tag":25,"props":89,"children":90},{},[91,96],{"type":24,"tag":63,"props":92,"children":93},{},[94],{"type":29,"value":95},"Vision models from Anthropic.",{"type":29,"value":97}," This landed. The model looks at a plate photo the way a human does: \"grilled chicken breast ~120 g + rice ~150 g + mixed veg salad ~80 g\", and returns structured JSON that we drop into the meal log. 
No fine-tuning — a general model plus a well-crafted prompt, and it handles home cooking, café plates, fast food, simple restaurant dishes.",{"type":24,"tag":25,"props":99,"children":100},{},[101],{"type":29,"value":102},"That's where we landed.",{"type":24,"tag":32,"props":104,"children":106},{"id":105},"what-it-looks-like-in-the-app",[107],{"type":29,"value":108},"What it looks like in the app",{"type":24,"tag":25,"props":110,"children":111},{},[112],{"type":24,"tag":113,"props":114,"children":117},"img",{"alt":115,"src":116},"Food photo recognition — NutriApp","\u002Fblog\u002Fimages\u002Fhow-food-photo-recognition-works\u002Frecognize-screen.webp",[],{"type":24,"tag":25,"props":119,"children":120},{},[121],{"type":29,"value":122},"You upload a photo and get back a caption (\"baked chicken breast, sliced, with fresh green salad, broccoli and red onion\"), a list of ingredients with grams and macros, you pick the meal type (breakfast \u002F lunch \u002F dinner \u002F snack) and save. Each ingredient's weight can be adjusted in one tap.",{"type":24,"tag":32,"props":124,"children":126},{"id":125},"where-it-isnt-perfect",[127],{"type":29,"value":128},"Where it isn't perfect",{"type":24,"tag":25,"props":130,"children":131},{},[132,134,139],{"type":29,"value":133},"— ",{"type":24,"tag":63,"props":135,"children":136},{},[137],{"type":29,"value":138},"Grams from a photo are always an estimate.",{"type":29,"value":140}," A ±15-20 g miss per ingredient is normal. 
That's why there's a mandatory manual-tweak step after recognition: you see the suggested grams and can fix them.",{"type":24,"tag":25,"props":142,"children":143},{},[144,145,150],{"type":29,"value":133},{"type":24,"tag":63,"props":146,"children":147},{},[148],{"type":29,"value":149},"Complex layered dishes are recognized less accurately than simple ones.",{"type":29,"value":151}," Casseroles, multi-layer sandwiches, home stews where ingredients hide under each other: the model often sees only the top layer.",{"type":24,"tag":25,"props":153,"children":154},{},[155,156,161],{"type":29,"value":133},{"type":24,"tag":63,"props":157,"children":158},{},[159],{"type":29,"value":160},"Dark backgrounds, bad light, and odd angles cut accuracy.",{"type":29,"value":162}," It works best with a top-down or 3\u002F4 view on a neutral background in daylight.",{"type":24,"tag":25,"props":164,"children":165},{},[166,167,172],{"type":29,"value":133},{"type":24,"tag":63,"props":168,"children":169},{},[170],{"type":29,"value":171},"Region-specific dishes.",{"type":29,"value":173}," Uzbek manty, plov, homemade dumplings: the model sees \"meat dumplings\" and guesses grams from an average portion. Fine as a starting point, but for precision the catalog route is better.",{"type":24,"tag":32,"props":175,"children":177},{"id":176},"what-worked",[178],{"type":29,"value":179},"What worked",{"type":24,"tag":25,"props":181,"children":182},{},[183,185,190],{"type":29,"value":184},"For an \"ordinary day\" (a bowl of oatmeal, chicken with a side, a sandwich with coffee), photo mode saves real minutes. And the bigger effect: ",{"type":24,"tag":63,"props":186,"children":187},{},[188],{"type":29,"value":189},"people start logging more often",{"type":29,"value":191},", because the meal-entry barrier dropped.",{"type":24,"tag":25,"props":193,"children":194},{},[195],{"type":29,"value":196},"You can see it directly in analytics: users who actively use photo mode retain at several times the rate of those who only log via the catalog. 
Logical — input cost is the main barrier in calorie tracking.",{"type":24,"tag":32,"props":198,"children":200},{"id":199},"whats-next",[201],{"type":29,"value":202},"What's next",{"type":24,"tag":25,"props":204,"children":205},{},[206],{"type":29,"value":207},"In the pipeline:",{"type":24,"tag":209,"props":210,"children":211},"ul",{},[212,223,233],{"type":24,"tag":213,"props":214,"children":215},"li",{},[216,221],{"type":24,"tag":63,"props":217,"children":218},{},[219],{"type":29,"value":220},"Better gram estimation for home-cooked dishes",{"type":29,"value":222}," — stews, pilafs, casseroles. The plan is a vision + recipe-base pairing: the model recognizes \"borscht\", we pull a typical composition, then scale grams to the visible bowl size.",{"type":24,"tag":213,"props":224,"children":225},{},[226,231],{"type":24,"tag":63,"props":227,"children":228},{},[229],{"type":29,"value":230},"Saved \"your\" dishes.",{"type":29,"value":232}," If you eat the same meal often — no need to re-recognize each time. Photograph once → save as a template → log with one click after that.",{"type":24,"tag":213,"props":234,"children":235},{},[236,241],{"type":24,"tag":63,"props":237,"children":238},{},[239],{"type":29,"value":240},"Packaging recognition.",{"type":29,"value":242}," Photograph the nutrition label on the package — we parse calories and macros and create the product in your database. Especially useful for brands missing from the general catalog.",{"type":24,"tag":25,"props":244,"children":245},{},[246,248,255],{"type":29,"value":247},"If you want to try it — ",{"type":24,"tag":249,"props":250,"children":252},"a",{"href":251},"\u002F",[253],{"type":29,"value":254},"open NutriApp",{"type":29,"value":256},". 
Photo mode is in the free tier with a daily cap; paid tiers remove the cap.",{"type":24,"tag":25,"props":258,"children":259},{},[260],{"type":29,"value":261},"And if you use similar tools and notice we recognize worse than competitors on something — tell us in the comments. We actually iterate on those reports.",{"title":7,"searchDepth":263,"depth":263,"links":264},2,[265,266,267,268,269,270],{"id":34,"depth":263,"text":37},{"id":55,"depth":263,"text":58},{"id":105,"depth":263,"text":108},{"id":125,"depth":263,"text":128},{"id":176,"depth":263,"text":179},{"id":199,"depth":263,"text":202},"markdown","content:en:blog:2026-06-13-how-food-photo-recognition-works.md","content","en\u002Fblog\u002F2026-06-13-how-food-photo-recognition-works.md","en\u002Fblog\u002F2026-06-13-how-food-photo-recognition-works","md",1781331365691]