All Systems Operational

API: Operational, 100.0% uptime over the past 90 days
Goal Get It! Applicant Portal: Operational, 99.71% uptime over the past 90 days
Management Portal: Operational, 99.98% uptime over the past 90 days
Goal Get It! Gmail: Operational, 100.0% uptime over the past 90 days
Single Sign On (Auth0): Operational, 100.0% uptime over the past 90 days
Goal Get It! Google Workspace: Operational, 100.0% uptime over the past 90 days
IT Service Desk: Operational, 100.0% uptime over the past 90 days
Main Website: Operational, 100.0% uptime over the past 90 days
Adobe For Goal Get It!: Operational, 100.0% uptime over the past 90 days
Canva: Operational, 100.0% uptime over the past 90 days
Feb 16, 2026
Resolved - Today, we completed a production maintenance window for the Goal Get It! Applicant Portal focused on reliability and failover hardening.

Changes completed
Stabilized multi-node production behavior behind Cloudflare Load Balancing.
Added and validated config parity checks across nodes to prevent node-specific errors (a sketch of this kind of check follows this list).
Confirmed health-check behavior and failover routing.
Deployed and verified standalone Cloudflare-hosted maintenance fallback page.
Applied portal UX updates, including improved duplicate-account detection in the registration flow.
Redirected the legacy forgot-password route to the current password-reset flow.
Removed temporary outage banner now that service is fully restored.
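
For readers curious what a config parity check can look like in practice, the following is a minimal sketch and not our actual tooling: a script each node runs against its env file to print salted digests of an allowlist of keys, so operators can diff the two outputs without copying secret values between hosts. The key names, file path, and salt handling are illustrative.

#!/usr/bin/env python3
"""Config-parity sketch (illustrative, not production tooling): print salted
digests of an allowlist of env keys so the output from node 1 and node 2 can
be diffed without exposing or copying secret values."""

import hashlib
import os
import sys

# Keys that must match on every node (illustrative list, taken from the
# variables discussed in the Feb 14 postmortem below).
PARITY_KEYS = (
    "WEB_CONCURRENCY",
    "REQUIRE_SECURE_API",
    "BACKUP_ENCRYPTION_KEY",
    "LOGIN_PAYLOAD_ENCRYPTION_PRIVATE_KEY",
)

def load_env_file(path: str) -> dict:
    """Parse a simple KEY=VALUE env file, ignoring comments and blank lines."""
    values = {}
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

def fingerprint(value: str, salt: str) -> str:
    """Salted SHA-256 prefix: identical values match without being revealed."""
    return hashlib.sha256((salt + value).encode()).hexdigest()[:16]

def main() -> None:
    env_path = sys.argv[1] if len(sys.argv) > 1 else ".env.production"
    salt = os.environ.get("PARITY_SALT", "shared-salt")  # same salt on both nodes
    values = load_env_file(env_path)
    for key in PARITY_KEYS:
        if not values.get(key):
            print(f"{key}: MISSING OR EMPTY")
        else:
            print(f"{key}: {fingerprint(values[key], salt)}")

if __name__ == "__main__":
    main()

Diffing the output from node 1 and node 2 surfaces the kind of drift described in the postmortem below (for example, an empty WEB_CONCURRENCY or a mismatched private key) without ever moving secret material between hosts.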

Current status
Portal is operational and serving normally.
No data loss occurred; application data remains secure and intact.
We are continuing to monitor system health and failover behavior.

Feb 16, 17:37 PST
Feb 15, 2026

No incidents reported.

Feb 14, 2026
Postmortem - details below.
Feb 14, 18:20 PST
Resolved - This incident has been resolved.
Feb 14, 18:15 PST
Update - We are continuing to monitor for any further issues.
Feb 14, 17:10 PST
Monitoring - A fix has been implemented and we are monitoring the results.
----------------
On February 14, 2026 (approximately 11:00 AM–3:00 PM PT), we experienced an intermittent outage and degraded reliability during our migration from a single-node deployment to a two-node setup behind a Cloudflare Load Balancer. This incident was multi-causal: several configuration, dependency, network connectivity, and TLS/origin issues compounded while node 2 was being brought online. Because our previous health checks were too shallow, the load balancer could route traffic to a partially broken node, resulting in inconsistent behavior depending on which node served a given request. Throughout the incident, application data remained secure and intact.
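
To make the "shallow health check" point concrete, here is a minimal sketch of a deeper readiness check, assuming a Django backend; the endpoint name, the environment-variable list, and the URL wiring (omitted) are illustrative rather than our production code.

# Hypothetical deep readiness check for a Django backend. A shallow /healthz
# only proves the web process answers; this check also verifies database
# connectivity and required configuration before reporting "ready" to the
# load balancer, so a partially broken node is taken out of rotation.

import os

from django.db import connections
from django.db.utils import OperationalError
from django.http import JsonResponse

REQUIRED_ENV_VARS = (
    "WEB_CONCURRENCY",
    "BACKUP_ENCRYPTION_KEY",
    "LOGIN_PAYLOAD_ENCRYPTION_PRIVATE_KEY",
)

def readyz(request):
    problems = []

    # Database reachability: a node that cannot reach PostgreSQL (e.g. TCP 5432
    # timeouts) fails here instead of appearing healthy.
    try:
        with connections["default"].cursor() as cursor:
            cursor.execute("SELECT 1")
    except OperationalError as exc:
        problems.append(f"database unreachable: {exc}")

    # Required configuration must be present and non-empty.
    for name in REQUIRED_ENV_VARS:
        if not os.environ.get(name):
            problems.append(f"missing or empty env var: {name}")

    if problems:
        # A non-200 status tells the load balancer to keep traffic off this node.
        return JsonResponse({"status": "not ready", "problems": problems}, status=503)
    return JsonResponse({"status": "ready"})

Pointing the load balancer's health check at a path like this, rather than at a liveness-only endpoint, is what keeps a node that is up but misconfigured from receiving traffic.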

What happened
A backend startup configuration issue caused Gunicorn workers to crash on one deployment path because WEB_CONCURRENCY was effectively empty, which made the node unhealthy (a pre-start validation sketch follows this section).
During node 2 bring-up, compose/dependency mismatches caused partial startup behavior.
Node 2 initially could not reach the primary PostgreSQL instance (TCP 5432 timeouts), which prevented backend services from initializing reliably.
Node 2 was also missing required production security configuration (BACKUP_ENCRYPTION_KEY while security maintenance was enabled), which caused Django worker boot failures.
In parallel, TLS/origin setup on node 2 was incomplete: Caddy could not complete the ACME challenge flow behind the proxied Cloudflare Load Balancer, leading to TLS/internal-origin errors.
Cloudflare Load Balancer/origin health behavior was inconsistent during setup, including unhealthy-endpoint detection that did not reflect full readiness and HTTP 400 responses on specific routing paths.
Finally, we identified critical environment drift between node 1 and node 2, including an invalid LOGIN_PAYLOAD_ENCRYPTION_PRIVATE_KEY on node 2, differing REQUIRE_SECURE_API values, and additional secret/config drift in some integrations. This drift caused intermittent behavior depending on which node served a request, including login failures, SSO/manage inconsistencies, and occasional admin errors.
Because our prior checks only validated /healthz, a partially broken node could still appear "up" and receive traffic.
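
One mitigation for the boot-time failures above (an empty WEB_CONCURRENCY, missing production secrets) is a pre-start guard that validates the environment before the application server launches its workers, so a misconfigured node exits with a clear error instead of serving traffic in a half-broken state. The sketch below is hypothetical and simplified, not the portal's actual entrypoint.

#!/usr/bin/env python3
"""Hypothetical pre-start guard (simplified): run it before launching Gunicorn
so a node with a bad environment exits with a clear error instead of booting
workers that crash or behave differently from the other node."""

import os
import sys

def fail(message: str) -> None:
    print(f"[prestart] FATAL: {message}", file=sys.stderr)
    sys.exit(1)

def main() -> None:
    # WEB_CONCURRENCY must be a positive integer; on Feb 14 it was effectively
    # empty on one deployment path, which crashed Gunicorn workers.
    concurrency = os.environ.get("WEB_CONCURRENCY", "").strip()
    if not concurrency.isdigit() or int(concurrency) < 1:
        fail(f"WEB_CONCURRENCY must be a positive integer, got {concurrency!r}")

    # Secrets that production mode requires; names follow the incident report.
    # (Only presence is checked here; validity checks need key-specific logic.)
    for name in ("BACKUP_ENCRYPTION_KEY", "LOGIN_PAYLOAD_ENCRYPTION_PRIVATE_KEY"):
        if not os.environ.get(name):
            fail(f"required secret {name} is missing or empty")

    print("[prestart] environment looks sane; starting application server")

if __name__ == "__main__":
    main()

A container entrypoint can chain a guard like this ahead of the server command, for example "python prestart_check.py && gunicorn <project>.wsgi", with the script name and module path adjusted to the real project.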

Feb 14, 16:57 PST
Identified - The issue has been identified and a fix is being implemented.
Feb 14, 15:51 PST
Update - We are continuing to investigate this issue.
Feb 14, 12:36 PST
Investigating - Goal Get It! is currently experiencing a critical error that is preventing or severely disrupting access to the applicant portal. Our team is actively working to restore service as quickly as possible. During this time, you may be unable to log in, submit materials, or view updates, and you may see errors or timeouts. We’re very sorry for the disruption—your application information remains secure and intact.
Feb 14, 12:30 PST
Feb 13, 2026

No incidents reported.

Feb 12, 2026

No incidents reported.

Feb 11, 2026

No incidents reported.

Feb 10, 2026

No incidents reported.

Feb 9, 2026

No incidents reported.

Feb 8, 2026

No incidents reported.

Feb 7, 2026

No incidents reported.

Feb 6, 2026

No incidents reported.

Feb 5, 2026

No incidents reported.

Feb 4, 2026

No incidents reported.

Feb 3, 2026

No incidents reported.

Feb 2, 2026

No incidents reported.