
I tried translating "Titanic: Machine Learning from Disaster - Information - Frequently Asked Questions" (Kaggle's tutorial on predicting which passengers survived the sinking of the Titanic)

Titanic: Machine Learning from Disaster

Frequently Asked Questions

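You will notice that there is a "Private Leaderboard". It exists to prevent overfitting (translator's note: overfitting means building a model so specialized in answering one particular set of questions that it can no longer handle other questions), where a model is not general but fits one particular data set too closely. Such a model looks good at first glance and scores well on the leaderboard, but it is hard to apply to other data sets. While you compete, your predictions are deliberately scored on only a fraction of the test data; the final standings are scored on the rest of the test data, which you cannot see. In this case the test data covers 419 passengers: you can see your score for 210 of them, but your final score is computed on the remaining 209, and that score stays hidden until the competition closes. Your public and private scores may differ, and in the end it is the private score that determines the ranking.
To add members to your team, click the "My Team" tab in the dashboard. There are two buttons for sending a request to a friend. The first is for inviting someone who has not yet joined the competition, or has not even joined Kaggle yet. The second is for merging with a team that is already in the Titanic competition (for example, a team you hit it off with on the leaderboard or in the forum). Make friends and go win the competition.
For more general FAQs, see the "Member FAQ wiki page" (https://www.kaggle.com/wiki/KaggleMemberFAQ).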


What is the difference between the private and public leaderboard?
You will notice there is a 'Private Leaderboard'. This is used to prevent overfitting, whereby one makes a model that is not very general, but rather too finely tuned to a particular data set. It is tempting, but not wise, to tweak your model so that it gets a really good score on the leaderboard, at the expense of being useful for other data sets. We intentionally show how well your predictions do on just a fraction of the test data -- but the final standings are scored on the rest of the test data, which you can't see. So in this case: of the 419 test passengers, you will see your score for 210 of them; however your final score (which you can't see until the close of the competition) will be on the other 209. Your public standing may be different from your private standing -- and in the end, it is your private score that counts!
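The public/private split described above can be illustrated with a small sketch. Everything in it beyond the row counts (419, 210, 209) is an assumption made for illustration: the FAQ does not say how the rows are chosen, so a seeded random shuffle stands in for it; plain accuracy is assumed as the metric; and the labels and the "model" are synthetic, not real Titanic data. The function names are hypothetical.

```python
import random

N_TEST = 419    # test passengers, as stated in the FAQ
N_PUBLIC = 210  # rows scored on the public leaderboard, as stated in the FAQ


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true:
        raise ValueError("need at least one label")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def split_public_private(n_rows, n_public, seed=0):
    """Return two disjoint, sorted lists of row indices: (public, private).

    Assumption: a seeded random shuffle. How the rows are actually
    chosen is not described in the FAQ.
    """
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    return sorted(idx[:n_public]), sorted(idx[n_public:])


def leaderboard_scores(y_true, y_pred, public_idx, private_idx):
    """Score the same predictions separately on each subset."""
    public = accuracy([y_true[i] for i in public_idx],
                      [y_pred[i] for i in public_idx])
    private = accuracy([y_true[i] for i in private_idx],
                       [y_pred[i] for i in private_idx])
    return public, private


# Synthetic ground truth: 1 = survived, 0 = did not survive.
rng = random.Random(42)
y_true = [rng.randint(0, 1) for _ in range(N_TEST)]

# Hypothetical model that agrees with the truth about 80% of the time.
y_pred = [t if rng.random() < 0.8 else 1 - t for t in y_true]

public_idx, private_idx = split_public_private(N_TEST, N_PUBLIC)
public_score, private_score = leaderboard_scores(
    y_true, y_pred, public_idx, private_idx)

print(f"public  ({len(public_idx)} rows): {public_score:.3f}")
print(f"private ({len(private_idx)} rows): {private_score:.3f}")
```

The two printed scores will generally differ even though they come from the same predictions, which is why tuning a model against the public score alone says little about where it will finish.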
Use the Forum before using Support?
We get a lot of questions here at Kaggle from participants that concern a variety of issues. If a problem seems generic, please post it on a forum for all to see, and the competition host, an expert or fellow participant, or a Kaggle employee will reply with an answer to help. These forums are extremely useful, as you may often find something strange/new about the data or the metric which you want to share. If you share your knowledge you will get a lot back!
If your problem persists or it's something that can't be solved with outside help, then please contact us.
Teams and Team Mergers
Like any project or assignment, cooperation is the most helpful way to learn. This section should come with a warning that some competitions don't allow teams larger than 1. However in those that allow bigger teams, we highly recommend it. You are probably coming to Kaggle with a particular knowledge in some field. This may get you to top 50. However there will be techniques and information of which you are not aware that could push you into the top 20.
In order to add members to your team, click the 'My Team' tab in the dashboard. You'll see two separate invite boxes, each with a button to send the request to your friend. The first invite box is meant for someone who has not joined the competition already (or even joined Kaggle yet). The second invite box is for extending an invitation to merge with an existing team in the Titanic competition (such as someone you see on the leaderboard or interact with in the forum). Go forth, make friends, and get competing!!
For more common questions see the Member FAQ wiki page (https://www.kaggle.com/wiki/KaggleMemberFAQ).


