西西河

Topic: [Digest] YouTube architecture -- 西电鲁丁


    This article is too good to keep to myself, so I'm posting it here for everyone's reference.

    http://highscalability.com/youtube-architecture

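    The article describes the software platform and main technical architecture of the YouTube site, focusing on how it stores a massive volume of video and serves more than 100 million video views per day.

    Summary:

    1. YouTube's software platform is entirely open source: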

    # Apache

    # Python

    # Linux (SuSE)

    # MySQL

    # psyco, a dynamic python->C compiler

    # lighttpd for video instead of Apache

    2. Most popular content is moved to a CDN (content delivery network):

    - CDNs replicate content in multiple places. There's a better chance of content being closer to the user, with fewer hops, and content will run over a more friendly network.

    - CDN machines mostly serve out of memory because the content is so popular there's little thrashing of content into and out of memory.

    # Less popular content (1-20 views per day) uses YouTube servers in various colo sites.

    - Caching doesn't do a lot of good in this scenario, so spending money on more cache may not make sense. This is a very interesting point.
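To see why a cache helps with popular content but not with the long tail, here is a rough simulation. It is only a sketch under made-up assumptions: the LRU eviction policy, the request counts, the catalog sizes, and the cache size are all hypothetical and are not figures from the article.

```python
import random
from collections import OrderedDict


def lru_hit_rate(requests, capacity):
    """Replay a request stream through an LRU cache and return the hit rate."""
    cache = OrderedDict()
    hits = 0
    for video_id in requests:
        if video_id in cache:
            hits += 1
            cache.move_to_end(video_id)  # mark as most recently used
        else:
            cache[video_id] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(requests)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical workloads of 100,000 requests each.
# "Popular": a small set of 500 hot videos requested over and over.
popular = [rng.randrange(500) for _ in range(100_000)]
# "Long tail": 1,000,000 distinct videos, each one requested rarely.
long_tail = [rng.randrange(1_000_000) for _ in range(100_000)]

CACHE_SLOTS = 1_000  # assumed cache size, in videos
popular_rate = lru_hit_rate(popular, CACHE_SLOTS)
long_tail_rate = lru_hit_rate(long_tail, CACHE_SLOTS)
print(f"popular hit rate:   {popular_rate:.3f}")
print(f"long-tail hit rate: {long_tail_rate:.3f}")
```

With the same cache size, the hot set fits entirely in memory and almost every request is a hit, while long-tail requests almost never find their video cached. That is the sense in which buying more cache for the long tail may not pay off.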

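    3. Thumbnails (4 thumbnails for each video, so there are a lot more thumbnails than videos) were initially stored on a Linux ext3 file system. After hitting the limit on the number of files in a directory, they decided to move to Google's BigTable (I recall 邓兄 here on the site wrote an introduction to it, though I don't remember whether that series was ever finished).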
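For sites without access to something like BigTable, a common workaround for the files-per-directory problem is to spread files across hashed subdirectories. To be clear, this is not what the post says YouTube did (they moved to BigTable); it is only a sketch of the usual alternative. The function name `sharded_path`, the `/thumbs` root, and the file naming are all hypothetical.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(root, filename, levels=2, width=2):
    """Map a file name to root/ab/cd/filename using a hash of the name.

    With levels=2 and width=2 (hex digits) there are 256 * 256 = 65,536
    leaf directories, so each one holds only a small share of the files.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, filename)


# Hypothetical naming scheme: four thumbnails per video.
for n in range(1, 5):
    print(sharded_path("/thumbs", f"video123_{n}.jpg"))
```

The hash is used only to pick a directory, so the same file name always maps to the same path and no lookup table is needed.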

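    4. A MySQL database stores the metadata, including tags, descriptions, and so on. (In fact, most ECM (enterprise content management) systems work the same way: metadata is managed in a database, and the content itself is stored in a file system or a database.) Can now scale database almost arbitrarily.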
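The "metadata in a database, content in the file system" split can be sketched roughly as follows. This is an illustration only: SQLite stands in for MySQL so the example runs without a server, and the table names, columns, and sample row are invented, not YouTube's actual schema.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE video_meta (
        video_id    TEXT PRIMARY KEY,
        title       TEXT NOT NULL,
        description TEXT,
        file_path   TEXT NOT NULL   -- the content itself lives outside the DB
    )""")
conn.execute("""
    CREATE TABLE video_tag (
        video_id TEXT NOT NULL REFERENCES video_meta(video_id),
        tag      TEXT NOT NULL,
        PRIMARY KEY (video_id, tag)
    )""")

# Hypothetical sample data.
conn.execute(
    "INSERT INTO video_meta VALUES (?, ?, ?, ?)",
    ("v1", "Cat video", "A cat plays the piano", "/videos/v1.flv"),
)
conn.executemany(
    "INSERT INTO video_tag VALUES (?, ?)",
    [("v1", "cat"), ("v1", "music")],
)


def paths_for_tag(conn, tag):
    """Return the file paths of all videos carrying the given tag."""
    rows = conn.execute(
        "SELECT m.file_path FROM video_meta m "
        "JOIN video_tag t ON t.video_id = m.video_id "
        "WHERE t.tag = ? ORDER BY m.video_id",
        (tag,),
    )
    return [row[0] for row in rows]


print(paths_for_tag(conn, "cat"))
```

The database only answers "which videos match", and the returned path is what a file server would then serve.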

    Surprisingly, YouTube's technical team has only 9 people:

    2 sysadmins, 2 scalability software architects

    2 feature developers, 2 network engineers, 1 DBA

    This should put a lot of websites to shame.


