TY - GEN
T1 - Learning cross-modal embeddings for cooking recipes and food images
AU - Salvador, Amaia
AU - Hynes, Nicholas
AU - Aytar, Yusuf
AU - Marin, Javier
AU - Ofli, Ferda
AU - Weber, Ingmar
AU - Torralba, Antonio
N1 - Publisher Copyright: ©2017 IEEE.
PY - 2017/11/6
Y1 - 2017/11/6
AB - In this paper, we introduce Recipe1M, a new large-scale, structured corpus of over 1m cooking recipes and 800k food images. As the largest publicly available collection of recipe data, Recipe1M affords the ability to train high-capacity models on aligned, multi-modal data. Using these data, we train a neural network to find a joint embedding of recipes and images that yields impressive results on an image-recipe retrieval task. Additionally, we demonstrate that regularization via the addition of a high-level classification objective both improves retrieval performance to rival that of humans and enables semantic vector arithmetic. We postulate that these embeddings will provide a basis for further exploration of the Recipe1M dataset and food and cooking in general. Code, data and models are publicly available.
UR - http://www.scopus.com/inward/record.url?scp=85038000096&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2017.327
DO - 10.1109/CVPR.2017.327
M3 - Conference contribution
AN - SCOPUS:85038000096
T3 - Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017
SP - 3068
EP - 3076
BT - Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017
Y2 - 21 July 2017 through 26 July 2017
ER -