You opened your weather app last night and it promised a beautiful sunset. This morning, it said nothing at all about whether that sunset actually arrived.
That silence is the industry standard. A forecast is a confident claim, and almost none of those claims are ever checked against the sky that showed up.
Sunset Verify is our refusal to keep that bargain. We grade our own sunset-color forecasts in public, the morning after, with a letter score any reader can pull up and argue with.
What Is Sunset Verify?
Sunset Verify is the part of the journal where Vesper holds itself accountable for a single falsifiable prediction: how vivid tonight's sunset will be. The afternoon before, we publish a color score; the next morning, we publish how close we got.
Sunset Verify grades every sunset-color forecast Vesper publishes against the sky that actually arrived. The afternoon prediction earns a public score the next morning — hit or miss, posted where any reader can see it.
This is not a confidence interval buried in a settings menu. The grade sits next to the forecast it judges, so the prediction and its report card travel together.
If you want the physics behind why one evening turns molten and the next stays gray, we covered that in why sunsets differ from one night to the next. Sunset Verify is the accountability layer that sits on top of that science.
Why No Other Weather App Grades Itself
The dominant pattern in weather software is prediction without reconciliation. The app tells you tomorrow's high, tomorrow's chance of rain, tomorrow's golden hour — and then tomorrow becomes today and the slate is wiped clean.
We think this is wrong, and we think the reason it persists is structural rather than technical. Nothing forces a weather app to check its own work, so almost none of them do.
Most weather apps never check their own forecasts because nobody makes them. A prediction with no scorecard can never be wrong out loud, which is comfortable for the app and useless for the reader.
Naming names: Dark Sky, Carrot, and Apple Weather all show a forecast and none of them show a track record. That is not a swipe at their engineering, which is often excellent — it is an observation about what they choose to make visible.
An unchecked forecast is a comfortable thing to ship. It can never be wrong out loud, it never produces an awkward number, and it never invites the reader to keep score.
Comfortable for the app is precisely the problem. A prediction you are never allowed to audit is a prediction you are being asked to take on faith.
| Dimension | Typical weather app | Sunset Verify |
|---|---|---|
| Forecast shown | Yes | Yes |
| Forecast checked after the fact | No | Every night |
| Errors published | Never | Prominently |
| Scorecard visible to the reader | None | A through F |
How We Score A Sunset Forecast
The forecast itself is a single number we call the color score, from 0 to 100, published in the afternoon. It estimates how saturated and dramatic the sky will be in the window around sunset.
That number is not a guess about the weather in general. It is a guess about a narrow set of conditions that have to line up near the horizon for color to happen at all.
We forecast a 0–100 color score the day before, then compare it to observed cloud height, humidity, and aerosols after dark. The gap becomes a letter grade, A through F, published the next morning.
After the sun is down, verification begins. We compare the prediction against what actually occurred, drawing on a few independent sources:
- Cloud structure and altitude. Mid- and high-level clouds catch under-lighting and turn vivid, while a low marine layer or a perfectly clear dome both tend to kill the show — and each fails differently.
- Humidity and aerosols. Moisture and particulate load shift the color temperature of the light and decide whether the reds deepen or wash out into haze.
- Observed sky and reader photos. Satellite imagery tells us what the cloud deck did; reader-submitted frames tell us what it looked like from the ground, which is the only thing that ultimately matters.
The gap between the predicted color score and the observed result becomes a letter grade. An A means we called a vivid sky and got one, or called a flat sky and were right to; an F means we sent you to the window for nothing, or worse, kept you inside during the best light of the week.
That last failure mode matters more to us than the first. Missing a great sunset is the error we work hardest to drive toward zero, because it is the one that actually costs the reader something — a point we make in our guide to reading golden hour light.
What We Got Wrong — And Published Anyway
The feature only means anything if the bad grades stay up. So they stay up.
There are evenings when we forecast a 20 and the western sky detonates into crimson, and the next morning Sunset Verify says, plainly, that we blew it. We do not quietly recompute the score or soften the language once the photos come in.
This is the editorial discipline that makes the rest of the journal credible. A publication that will tell you when it was wrong about the sky is a publication you can believe when it says the morning is worth getting up for.
Our rule for Sunset Verify is simple: the grade goes up before we know whether it flatters us, and it stays up after we find out it doesn't.
The Design Tradeoffs We Actually Made
Building Sunset Verify meant choosing, repeatedly, between what flattered us and what served the reader. None of these calls were obvious, and a couple of them still get argued internally.
Here are the tradeoffs that shaped the feature:
- Honesty versus polish. We could have graded ourselves on a generous curve to keep the average high, but a scorecard you tune in your own favor is marketing with a number stapled to it.
- Defining "right" for a subjective thing. Sunset beauty is partly in the eye of the viewer, so we committed to a measurable proxy — color saturation and the illumination of mid-level cloud — and accept that it will occasionally disagree with how a night felt.
- Where the grade lives. Putting the report card next to the forecast risks undermining the forecast, and we did it anyway, because burying the grade two taps away would have been a quieter way of not really shipping it.
- The temptation to hedge. The single fastest way to raise our accuracy would be to predict a dull sky every night, and we ruled that out on day one.
The hardest tradeoff was honesty versus polish. We could have graded ourselves gently to protect the average, but a scorecard you tune in your own favor is just marketing with a number on it.
That last temptation deserves its own line, because it is the one that quietly destroys features like this.
We refuse to hedge forecasts to protect the score. Predicting a dull sky every night would lift our accuracy and make the feature worthless, so we forecast the sky we actually expect.
A forecast optimized to protect a score stops being a forecast and becomes a hedge. We would rather post an honest F on a night we misread than a safe C on a night we refused to commit.
What Public Accountability Does To A Forecast
The surprising part is what grading ourselves did to the forecasting itself. When every prediction is going to be scored in the morning, the afternoon work gets sharper.
You stop reaching for the comfortable hedge. You commit to a number, because a vague forecast and a confident one earn the same F when they are wrong, and only the confident one earns an A when it is right.
Publishing misses builds trust rather than spending it. A forecast you can audit is one you can believe; a forecast that hides its record is asking for faith it has not earned.
Readers feel this even if they never open the methodology. A forecast that keeps its own scorecard reads as a forecast made by someone who expects to be checked — which is exactly the posture we want every brief written in, as we explained in how Vesper writes a daily brief.
Trust is not built by being right every time, which no honest forecaster can promise. It is built by being checkable, and then surviving the check often enough to matter.
Why This Fits The Rest Of Vesper
Sunset Verify is not a gimmick bolted onto a weather app. It is the same conviction that runs through the whole journal, expressed as a number.
We built Vesper on the belief that a forecast should be worth reading, not just worth glancing at — the case we lay out in full in what makes weather worth reading. A take you can audit is the most honest form of a take worth reading.
If you want the longer version of why we started a weather publication willing to argue with itself in public, it lives in why we built Vesper. Sunset Verify is that philosophy with the stakes turned up, because here the reader gets to keep score.
Frequently Asked Questions
A few of the questions readers ask most often about how the feature works.
How does Sunset Verify score a sunset forecast?
We predict a 0–100 color score the afternoon before, then compare it to observed cloud height, humidity, and aerosols after sunset. The difference becomes a public A-to-F grade the next morning.
Why don't other weather apps grade their forecasts?
Because nothing forces them to. An unchecked forecast can never be publicly wrong, so most apps show a confident prediction and quietly move on to tomorrow.
Does showing your forecast errors hurt credibility?
The opposite. A forecast you can audit earns more trust than one that hides its record, and readers consistently tell us the misses are why they believe the hits.
Can Vesper game its own Sunset Verify score?
It could, by forecasting dull skies every night to keep accuracy high. We refuse to hedge that way, because a scorecard you tune in your own favor is worthless to the reader.
Why is sunset color so hard to forecast?
Color depends on mid-level cloud height, humidity, and aerosols lining up in a narrow window near the horizon. Small errors in any of them flip a forecast from vivid to gray.
Read Tonight's Forecast — And Tomorrow's Grade
Tonight's color score is already posted, and tomorrow morning it will carry a grade whether we earned it or not. That is the deal we made with you.
Open the Vesper journal this evening, read the sunset call, and come back in the morning to see how we did. If we were wrong, you will be the first to know, because we will be the ones to say so.
Frequently Asked Questions
What makes Vesper Sky different from other weather apps?
Vesper replaces template-driven forecasts with short editorial briefs written in an authorial voice, and publicly grades its own sunset predictions through Sunset Verify. Every other weather app on the market generates its text by filling variables into a template. Vesper writes each forecast as original prose with a point of view about the day.
Is Vesper Sky free?
No. Vesper Sky is a subscription app with no free tier. Monthly ($2.99) and annual ($24.99) plans both include a 3-day free trial, and a one-time lifetime purchase is available for $59.99. Downloading the app from the App Store is free, but using any feature requires an active subscription or a lifetime purchase.
What is Sunset Verify?
Sunset Verify is Vesper's signature feature that predicts sunset quality each day from live atmospheric data and lets users verify the prediction with a photo, building a personal accuracy track record over time.
When will Vesper Sky be available?
Vesper is currently in beta. Join the waitlist at vespersky.ai/beta to get early access and be notified when the app launches on iOS and Android.
What does it mean for a weather app to be editorial?
An editorial weather app applies a point of view to the same atmospheric data every other app has. Instead of showing you a grid of numbers, it writes a short brief — two or three sentences with intent — about what the day is going to feel like and what you should probably do about it. The data is identical. The voice is the product.
How does Vesper Sky write a brief if it is not a human writer?
Vesper's briefs are generated by a language model operating under an editorial style guide written by people and refined through thousands of examples. The style guide, cut discipline, and voice rules are the content. The model is the mechanism. Template weather apps are generated by models that were never given an editorial style guide, which is why they all sound identical.
Does Vesper Sky have radar maps or severe weather alerts?
Vesper does not ship radar maps or a proprietary severe weather alert system. Severe weather alerts come through the operating system, which is the right place for them. Radar was rejected because a radar map is not a brief and would not make the forecast more worth reading. We respect both as product decisions. We are doing something different.
Which cities does Vesper cover?
Vesper publishes editorial weather coverage for over 100 US cities with full daily briefs and all 50 state hubs with region-specific editorial context. The mobile app gives you a brief wherever you are — anywhere Vesper has weather data coverage, which is essentially every populated area in the world.
Is my location data private on Vesper?
Yes. Vesper uses your approximate location only to deliver weather forecasts for your area. Location data is not stored on our servers, not sold, and not shared with third parties. Photos taken through Sunset Verify stay on your device and never leave your phone.
How often does the Vesper Brief update?
A fresh editorial brief is generated every morning based on that day’s forecast. Inside the app, live conditions update continuously based on your location. The editorial brief is a once-a-day artifact — written to be read in the morning, not refreshed hourly.
Can I use Vesper without an account?
Yes. Vesper does not require an account to read the daily brief, check sunset predictions, or use the editorial features. Personal data like Sunset Verify history is stored locally on your device, so there is no cloud account to create.