168大数据

Cloudera's Chief Architect on the Evolution of Hadoop

Posted 2014-10-30 13:16:32

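In this interview, Cloudera chief architect Doug Cutting explains why open source development is more about common sense than creed, and looks closely at how open source is adopted in the enterprise.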
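Doug Cutting is the founder of numerous highly successful open source projects, including heavyweights such as Lucene and Hadoop. He is currently chief architect at Cloudera and also sits on the board of the Apache Software Foundation.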
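In this interview, he explains why open source development is more about common sense than creed, and looks closely at how open source is adopted in the enterprise. Ahead of his keynote at the All Things Open conference, I also asked him about open sourcing Lucene, his role at the Apache Software Foundation, and what the open source way means to him.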
You released Lucene on SourceForge under the GPL back in 2000. What was it like to open source it at that time, and how did it differ from today?
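Things were not that different from today. People in the academic and research communities had been sharing software for a long time, so neither downloading free software nor open source licenses were new. (I first ran into the GPL in 1985, when I contributed some code to GNU Emacs.) What did differ were the tools. We used Concurrent Versions System (CVS), since even Subversion wasn't yet available. We didn't use a bug tracker, just the mailing list, but the fundamental process was much the same: people communicate to coordinate their work on a shared project.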
Ever since Lucene, the first project you founded, you have followed the principles of the open source way. Do you still apply them today, and why?
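To me, open source development is more common sense than following any particular creed. I want to help create software that people actually use, with usefulness as the first goal, and I like doing that together with other people. The rest follows naturally. You must treat collaborators with respect, or they won't want to collaborate. Likewise, transparency and meritocracy are required to build healthy, long-lived collaborative communities. At this level it's not much different from non-software projects. When you clean up after a party, some people clear the table, some wash the dishes, and others put the chairs away. No one is the boss; everyone pitches in where they can to reach the group's goal, which is both to get the house clean and to remain friends.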
You are a member of the Apache Software Foundation board. Can you tell us a bit about your role?
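Mostly, the Apache board monitors all the projects in the foundation to make sure each has a healthy community. We need to ensure that no project is controlled by a single person or company and that everyone acts respectfully. Each of the 150+ Apache projects submits a quarterly report to the board, so we review about 50 projects at each monthly meeting. Most run smoothly. Occasionally we have to nudge a project in the right direction. The board also handles the usual administrative work, such as making sure someone keeps the website running, collecting donations, and filing taxes.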
With more and more enterprises adopting open source, where do you see open source and Hadoop in three to five years?
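I gravitated to open source because it suited me as a developer. It lets many people use the software I work on, which is personally rewarding. It is also very attractive to software users, since it makes them less dependent on a particular vendor (that is, less "locked in"). More and more developers are creating open source alternatives to proprietary technologies, and given the choice, users prefer an open source implementation for its lack of lock-in. The Hadoop ecosystem has taken the next step: here the open source implementations came first. Few are motivated to create proprietary alternatives, since people would likely prefer the open source versions. I expect this pattern to continue for many years. The core components of the Hadoop ecosystem will remain open source even as the core grows and mutates. Some proprietary tools survive at the top of the stack, but few will at the base.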
What is your take on the formation of the TODO group?
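I spoke briefly with them, and I think it's just a mailing list where people who run corporate open source projects discuss best practices. They don't seem to have more of an agenda than that. Lots of companies publish something as open source and run into common technical and legal issues. They'd like to collaborate on approaches, or at least commiserate.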
English original:
Chief Architect of Cloudera on growth of Hadoop
Doug Cutting is the founder of numerous successful open source projects, including Lucene and Hadoop. He is currently the chief architect at Cloudera and sits on the board of the Apache Software Foundation.
In this interview, he tells me how working on open source is more about common sense than creed and dives into open source adoption in the enterprise. Prior to his keynote at the All Things Open conference, I asked him about open sourcing Lucene, what his role is like on the board of the Apache Software Foundation, and what the open source way means to him.
What was it like to open source Lucene back in 2000 when you released it on SourceForge under the GPL license?
It wasn’t that different than things are today. Folks had shared software for a long time in the academic and research communities, so the concept of downloading free stuff wasn’t new, nor were open source licenses. (I first ran into the GPL in 1985 when I contributed some code to GNU Emacs.) The tools were different. We used Concurrent Versions System (CVS), since even Subversion wasn’t yet available. We didn’t use a bug tracker, just the mailing list, but the fundamental process is much the same. People communicate to coordinate their work on a shared project.
Since the first project you founded, Lucene, you have followed the open source way principles. Do you still apply them today, and why?
To me it’s more common sense than following any particular creed. I want to help create software that people use, that’s useful. I like to do this together with other people. The rest follows naturally. One must treat collaborators with respect or they won’t want to collaborate. Similarly, transparency and meritocracy are required to build healthy, long-lived collaborative communities. At this level it’s not much different than non-software projects. If you’re cleaning up after a party then some folks need to clear the table, some wash dishes, some put chairs away, etc. No one is the boss, everyone just pitches in where they can to achieve the group’s goal, which is both to get the house clean and to remain friends.
You are a member of the Apache Software Foundation board. Can you tell us a bit about your role?
Mostly the Apache Board monitors all the projects in the foundation to make sure each has a healthy community. We need to ensure that projects aren’t controlled by one person or company, that everyone is acting respectfully, etc. Each of the 150+ Apache projects submits a quarterly report to the board, so we review about 50 projects at each monthly meeting. Most run smoothly. Occasionally we have to give a project a nudge in the right direction. The board also deals with the typical administrivia, like making sure someone keeps the website running, collecting donations, filing taxes, etc.
With the increasing adoption of open source in enterprise, where do you see both open source and Hadoop in 3 to 5 years?
I gravitated to open source because it suited me as a developer. It lets lots of folks use software I work on, which is personally rewarding. But it also is very attractive to users of software, since they can be less dependent on other businesses (“locked in”). More and more developers are creating open source alternatives to proprietary technologies. Given the choice, users prefer an open source implementation for its lack of lock-in. The Hadoop ecosystem has taken the next step, where the open source implementations came first. Few are motivated to create proprietary alternatives since folks would likely prefer the open source versions. I expect this pattern to continue for many years. The core components of the Hadoop ecosystem will remain open source, even as the core grows and mutates. Some proprietary tools survive at the top of the stack, but few will at the base.
What is your take on the formation of the TODO group?
I spoke briefly with them, and I think it’s just a mailing list for folks running corporate open source projects to talk about best practices. They don’t seem to have more of an agenda than that. Lots of companies publish something open source and have common technical and legal issues. They’d like to collaborate on approaches, or at least commiserate.
via: translated by 核子可乐, 51CTO

