gitdataai/libs/api/robots.rs
ZhenYi d593354ba9
feat: add sitemap index with static/users/projects/repos sub-sitemaps
- Main sitemap index at /sitemap.xml referencing 4 sub-sitemaps
- /sidemap/static: fixed routes (homepage, auth, marketing pages)
- /sidemap/users: public user profiles sorted alphabetically
- /sidemap/projects: public projects sorted alphabetically
- /sidemap/repos: public repos sorted alphabetically
- Redis cache with 8h TTL (no refresh on access), key: sidemap:{type}
- robots.txt Sitemap URL uses main_domain() with https:// forced
- All sitemap loc entries use https:// base URL
2026-04-26 00:06:18 +08:00


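use actix_web::{web, HttpResponse};
use service::AppService;

/// Serves robots.txt, blocking all sensitive paths from crawlers.
pub async fn robots(service: web::Data<AppService>) -> HttpResponse {
    // Fall back to the production domain if the configured one cannot be read.
    let raw = service
        .config
        .main_domain()
        .unwrap_or_else(|_| "https://gitdata.ai".to_string());
    // Force https:// and strip any trailing slash in every branch, so the
    // Sitemap URL never contains a double slash before "sitemap.xml".
    let sitemap_base = if raw.starts_with("https://") {
        raw.trim_end_matches('/').to_string()
    } else if raw.starts_with("http://") {
        raw.replacen("http://", "https://", 1)
            .trim_end_matches('/')
            .to_string()
    } else {
        format!("https://{}", raw.trim_end_matches('/'))
    };
    let body = format!(
        r#"User-agent: *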
Disallow: /api/
Disallow: /health
Disallow: /metrics
Disallow: /ws/
Disallow: /avatar/
Disallow: /blob/
Disallow: /media/
Disallow: /static/
Disallow: /assets/
Sitemap: {sitemap_base}/sitemap.xml
"#,
    );
    HttpResponse::Ok()
        .content_type("text/plain; charset=utf-8")
        .body(body)
}
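
The scheme handling in the handler above can be pulled into a pure function so it can be checked without a running server. The following is a minimal sketch, not part of the original file: `normalize_sitemap_base` and `sitemap_line` are hypothetical names, and the sketch assumes the configured domain is either a bare host or a host prefixed with `http://` or `https://`.

```rust
/// Hypothetical helper (not in the original file): forces an `https://`
/// scheme and removes trailing slashes from a configured domain string.
fn normalize_sitemap_base(raw: &str) -> String {
    let raw = raw.trim();
    // Drop whichever scheme is present; a bare host passes through unchanged.
    let host = raw
        .strip_prefix("https://")
        .or_else(|| raw.strip_prefix("http://"))
        .unwrap_or(raw);
    format!("https://{}", host.trim_end_matches('/'))
}

/// Hypothetical helper: builds the `Sitemap:` line that closes robots.txt.
fn sitemap_line(raw_domain: &str) -> String {
    format!("Sitemap: {}/sitemap.xml", normalize_sitemap_base(raw_domain))
}
```

Stripping the scheme first and trimming slashes afterwards means all three input shapes go through the same code path, so a trailing slash cannot survive in only one branch.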