
ByteDance's Cao Huanhuan: The more diverse the recommended content, the higher the probability of long-term user retention | 2019 WISE Super Evolvers Conference


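On July 9-10, 36Kr held the "2019 WISE Super Evolvers" conference simultaneously in Beijing and Shanghai. The event had seven venues and covered topics including the paths of corporate development and transformation, reading industry trends, the advance and transformation of the retail industry, the rise of the trillion-scale enterprise services market, opportunities for industrial innovation, and the logic behind globalization trends and the surge in differentiated demand. More than a hundred industry leaders were invited, with a focus on the rise of the super evolvers who are leading change in their industries.

Some people believe that algorithmic recommendation narrows users' interests: after using a recommendation system for a long time, a user's experience, knowledge, and insight fail to grow. Responding to this criticism at the conference, Cao Huanhuan, a senior algorithm architect at ByteDance, said: "This is a misunderstanding of algorithms."

He explained that no smart algorithm engineer wants users' interests to narrow, just as no shopping mall manager wants customers to look at the same category of goods on every visit. Mall managers want customers to pay attention to as many product categories as possible, and algorithm engineers likewise want users to broaden their interests as much as possible.

According to Cao, the industry has long used recommendation systems to explore ways of expanding user interests and improving content diversity. The mainstream technique today is deep-learning-based recommendation. Deep learning offers many methods, including special network configurations that let the model learn something new. All users and content items are vectors in a high-dimensional space, and the model can be deliberately guided to learn content a user might be interested in.

The following is a transcript of the speech:

Today I would like to share how algorithmic recommendation helps users expand their interests. Many of you may find this topic a little surprising. Algorithmic recommendation is new, and it has been applied at scale only in the last few years. As a rule, a new thing gives rise to misunderstandings because many outsiders do not know it well.

One common misunderstanding is that algorithmic recommendation narrows users' interests. The logic is that the algorithm understands you well, recommends according to your interests, and pushes only what you are interested in, so what you see always stays within a limited range. Over the long run, after a few years of using a recommendation system, your experience, knowledge, and insight will not grow.

Why is this a misunderstanding? People think this way because many do not understand algorithm engineers, nor do they know enough about recommendation systems. No smart algorithm engineer wants users' interests to narrow, just as no shopping mall manager wants customers to look at the same category of goods on every visit. Mall managers want customers to pay attention to as many product categories as possible, and algorithm engineers likewise want users to broaden their interests as much as possible.

A recommendation system must be an intelligent, learnable system

A recommendation system is by nature built on a massive amount of content. There has to be a lot of content; with only ten items there is nothing to recommend. When there is a lot of content and no way of knowing in advance which items you will like, letting the system choose saves the user effort and time. Because it has to pick what interests the user out of a massive pool, a recommendation system must be intelligent and learnable, and it adjusts itself according to user feedback. There are many kinds of feedback: in e-commerce, placing an order or adding an item to the cart; in content, clicks; in short video, plays. Through all kinds of positive and negative feedback, the system keeps observing and learning, and keeps adjusting itself according to these signals so that it better matches users' interests and needs. That is the essence of a recommendation system.

The first industry to apply recommendation systems was film. Back in 2006, Netflix, which was still in the DVD business at the time, launched a competition with a prize of one million US dollars: whoever could devise a method 10% better than its existing movie recommendation algorithm would win the prize.

Recommendation systems came to news and information relatively late, and Toutiao was probably the first in the world to do it. I joined Toutiao fairly early, at the beginning of 2014. Before Toutiao, the industry had some personalized recommendation solutions, but they were all based on interest subscriptions. Earlier products, such as Google Reader, required users to go through the tedious process of subscribing to a pile of sources or tags. Toutiao was the first in the world to achieve recommendation that the system learns entirely automatically.

Although recommendation systems in different industries differ in application area and scenario, they are similar in essence. All recommendation systems rely on three kinds of features: content features, user features, and environment features. The system has to combine these three kinds of information to make decisions. User features are the user's tags. They include basic information submitted at registration, such as gender and age, and the user's actions on the platform, such as the list of articles the user has clicked in the past and the keyword and author distributions of those articles, among other information.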
As for content features, if the item is a product, the important ones are its category, tags, and past purchase reviews. If the item is content, they are its text, topic, keywords, and similar information. Environment features are information about the user's context. A user's interests change many times, and some of the changes are periodic. For example, a news app user's interests differ between working hours and the commute. A recommendation system has to take all of this feature information into account, although different fields and different systems emphasize different parts. In general, every recommendation system has to make decisions based on these three kinds of information. There should be no system that uses feature information outside these three kinds, so this summary is fairly comprehensive.

The more diverse the recommended content, the higher the probability of long-term user retention

Once you understand the basic concepts of recommendation systems, a question may come to mind: as a developer, how do you design a recommendation system, and what is its goal? From the standpoint of the designer and operator, the goals sit at different levels: short-term, medium-term, and long-term.

Take the long-term goal first. A company running a business hopes to improve users' long-term loyalty. We hope that after trying Toutiao and our other apps, users keep using them and become loyal users. For one thing, a user who keeps using a product for a long time must be having a very good experience. From the company's perspective, long-term revenue is then assured, so this is certainly the long-term goal. But long-term goals are very hard for algorithms and models to learn. The further out the goal, the harder it is for machine learning. So there are also medium-term goals, such as whether the user comes back next week or next month.
Improving stickiness over such a window is also difficult. The industry is exploring approaches here, such as reinforcement learning, a learning paradigm distinct from supervised learning, but it remains hard and is not yet very mature. The most mature technology targets short-term goals, meaning the user's immediate feedback on what was recommended. We push an article to a user: did the user click it or like it? Douyin pushes a short video: was it played, liked, or shared? Models for these short-term goals are very easy to learn, and user behavior is easy to model.

Short-term and long-term goals are positively correlated, but short-term goals cannot completely replace long-term goals. According to our observations, the more diverse the recommended content, the higher the probability of long-term user retention. If you push only popular content, users click a lot in the short term and enjoy what they see today. But the content lacks diversity and is monotonous, and long-term retention is very poor.

This is the same as the mall analogy I mentioned earlier. Take a user who likes shoes: if every visit to the mall lets them quickly buy the shoes they like, each individual purchase is very satisfying, but in the end that user will visit this mall less often, coming only when they need shoes. To keep the user for the long term, you have to go beyond that one interest and broaden their horizons, so that they also take care of clothing, dining, and movies at the mall.

So, from the designer's point of view, we very much want the recommendation system to perform well on short-term metrics and also to recommend diverse content that satisfies a user's multiple interests. We even need to uncover more interests for users and try to satisfy more of them on a single platform.
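So content diversity is something we need as well. Toutiao is a pioneer in news and information recommendation and has accumulated a lot of experience in this area. If you are a long-time Toutiao user, you should find that the content you see is very diverse.

How the recommendation system explores more user interests

Next, let me talk about how we explore more of a user's interests in the recommendation system and keep the recommended content from being too monotonous.

At the strategy level, the recommendation system uses deduplication and scattering strategies. Before recommendation, all content is analyzed for similarity at several levels to identify which two articles or videos are very similar. For example, two articles may be worded differently yet say the same thing. The system can work out which articles are about the same event, or involve the same person or the same company, based on their various features. At recommendation time, the system then treats these related articles differently depending on the kind of similarity. For near-duplicate articles, once article A has been recommended to a user, other articles similar to A will no longer be recommended. This is the deduplication strategy. The other case is articles in the same direction or on the same type of topic, for example articles that are all about football. Here the system needs a scattering strategy to make sure such articles are not recommended too frequently, so that users do not feel the feed is homogeneous and content diversity is preserved. All of this is guaranteed algorithmically.

Besides deduplication and scattering, we also reserve a share of traffic for exploring users' interests. We will even sacrifice short-term goals: for example, every few refreshes, or in one slot of a refresh, the position is used to explore the user's interests by recommending items that the model is not sure the user will like but wants to try. There is some traffic of this kind.

Then there is the recommendation model itself. The oldest recommendation model is collaborative filtering, an older-generation technique from more than ten years ago, and even then people were already thinking about expanding users' interests, because if you keep pushing the same thing, users will certainly churn. Collaborative filtering is simple and intuitive. The system considers which users you resemble and what you have both clicked, say you both like the same kind of movies, and then recommends to you what that person has watched and you have not. Interest exploration is achieved by way of similar users.

Many advanced techniques today are also exploring user interests and promoting content diversity. The mainstream technique now is deep-learning-based recommendation, and deep learning offers many methods, including special network configurations that let the model learn something new. This is relatively easy to do with deep learning, because all users and all content are vectors in a high-dimensional space. The model can be deliberately guided to learn content a user might be interested in even though its interest tags differ from the user's: as long as the content maps to a nearby point in this space, the model can readily recommend it. There are many advanced approaches here.

Finally, there is one more important means: what we give users is an app, a complete product. Many people worry that, even with all these techniques, the recommendation algorithm still cannot gauge their interests well or discover more of them. As a product, though, the app has many features. In Toutiao, for instance, we have built many features that help people expand their interests.

One is trending news. Important news such as earthquakes, as well as trending stories in other, niche categories, we recommend directly to everyone. We also produce many polished special features, with sizable operations teams behind them.

Another is following. Over the past two years we have invested heavily in UGC and done very well, and it now covers celebrities and influential verified accounts in basically every industry.

Another is search. Toutiao has started to build search, and we believe search and recommendation are both very important channels for obtaining information. You can try Toutiao search, which lets you search content from across the web within Toutiao. I use Toutiao search a lot now, and in turn it helps the recommendation system understand me and recommend better.

In addition, Toutiao has many channels. We encourage users with an exploratory spirit to search for and browse channels. If you feel the system's recommendations are not good enough, or the system's own exploration is too slow to meet your needs, you can subscribe to channels and consume the content you are interested in, which speeds up the system's discovery of your interests.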
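
The deduplication and scattering strategies described in the talk can be illustrated with a minimal Python sketch. It is not Toutiao's implementation: the `Item` fields (`story_id`, `topic`, `score`), the function names, and the `max_run` parameter are all assumptions made for illustration, and a real system would derive similarity from content analysis rather than from precomputed labels.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Item:
    item_id: str
    story_id: str  # hypothetical: items about the same event share a story_id
    topic: str     # hypothetical coarse topic label, e.g. "football"
    score: float   # hypothetical relevance score from a ranking model


def deduplicate(ranked: Iterable[Item], seen_stories: Set[str]) -> List[Item]:
    """Keep the first item of each story; drop stories the user has already seen."""
    seen = set(seen_stories)
    kept = []
    for item in ranked:
        if item.story_id in seen:
            continue
        seen.add(item.story_id)
        kept.append(item)
    return kept


def _extends_run(result: List[Item], item: Item, max_run: int) -> bool:
    """True if appending item would create more than max_run same-topic items in a row."""
    tail = result[-max_run:]
    return len(tail) == max_run and all(prev.topic == item.topic for prev in tail)


def scatter(ranked: Iterable[Item], max_run: int = 1) -> List[Item]:
    """Greedily reorder so that at most max_run (>= 1) consecutive items share a topic."""
    remaining = list(ranked)
    result: List[Item] = []
    while remaining:
        # Pick the highest-ranked item that does not extend an over-long run.
        # If every remaining item would extend it, fall back to the top item.
        pick = next(
            (i for i, item in enumerate(remaining)
             if not _extends_run(result, item, max_run)),
            0,
        )
        result.append(remaining.pop(pick))
    return result


def build_feed(candidates: Iterable[Item], seen_stories: Set[str],
               max_run: int = 1) -> List[Item]:
    """Rank by score, remove near-duplicates, then scatter same-topic items."""
    ranked = sorted(candidates, key=lambda item: item.score, reverse=True)
    return scatter(deduplicate(ranked, seen_stories), max_run)
```

With `max_run=1`, a deduplicated list whose topics are ranked football, football, tech, football, food comes out as football, tech, football, food, football: lower-scored items from other topics are pulled forward to break up the run. This trades a little short-term relevance for diversity, which is the trade-off the talk describes.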
