Humans are uniquely capable social learners. Our capacity to learn from others across short and long timescales is a driving force behind the success of our species. Yet there are seemingly maladaptive patterns of human social learning, characterized by both overreliance and underreliance on social information. Recent advances in animal research have incorporated rich visual and spatial dynamics to study social learning in ecological contexts, showing how simple mechanisms can give rise to intelligent group dynamics. However, similar techniques have yet to be translated into human research, which additionally requires integrating the sophistication of human individual and social learning mechanisms. Thus, it is still largely unknown how humans dynamically adapt social learning strategies to different environments and how group dynamics emerge under realistic conditions. Here, we use a collective foraging experiment in an immersive Minecraft environment to provide unique insights into how visual-spatial interactions give rise to adaptive, specialized, and selective social learning. Our analyses show how groups adapt to the demands of the environment through specialization of learning strategies rather than homogeneity and through the adaptive deployment of selective imitation rather than indiscriminate copying. We test these mechanisms using computational modeling, providing a deeper understanding of the cognitive processes that dynamically influence social decision-making in ecological contexts. All results are compared against an asocial baseline, allowing us to identify specialization and selective attention as uniquely social phenomena, which provide the adaptive foundations of human social learning.