Cooperation has been widely studied in multi-agent foraging tasks. However, the long-term impact of agent-environment interactions and the achievement of sustainability remain largely unexplored in this context. This work contributes to the development of a testbed for exploring social dynamics between agents: the ‘sustainable foraging problem’. The testbed examines the effect of agent behaviour and the dilemma each agent faces in choosing between individual reward and collective long-term goals for sustainable resource management. To cover different resource replenishment rates, three environment types are formulated: forest, pasture and desert. A co-evolving deliberative loop based on neuro-evolution, which directs the agents to act with greedy or moderate behaviour, is demonstrated. This deliberative layer is shown to be insufficient in situations of social dilemma, where the agents learn to increase their individual rewards rather than increasing rewards collectively by sustaining the environment. A simple reflective governor, based on the notion of the agent’s self-awareness, is illustrated; it allows the agents to occasionally reason about the long-term impact of their immediate actions on future resource availability in the environment, which may eventually ensure sustainability.
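The tension between greedy and moderate behaviour under different replenishment rates can be illustrated with a minimal toy sketch. It is not the testbed itself. The logistic regrowth rule, the numeric rates, the ordering forest > pasture > desert, the per-agent harvest sizes and the names `REPLENISHMENT`, `HARVEST` and `simulate` are all assumptions introduced here for illustration.

```python
# Toy shared-resource model; every constant below is a hypothetical value.
REPLENISHMENT = {"forest": 0.5, "pasture": 0.2, "desert": 0.05}  # assumed rates
HARVEST = {"greedy": 5.0, "moderate": 1.0}  # assumed units taken per agent per step


def simulate(env, behaviour, n_agents=4, steps=50, capacity=100.0):
    """Return (final_stock, total_reward) for a group sharing one resource."""
    rate = REPLENISHMENT[env]
    demand = HARVEST[behaviour]
    stock = capacity
    total_reward = 0.0
    for _ in range(steps):
        # Each agent harvests in turn and cannot take more than what is left.
        for _ in range(n_agents):
            taken = min(demand, stock)
            stock -= taken
            total_reward += taken
        # Assumed logistic regrowth: nothing grows back from an empty stock.
        stock += rate * stock * (1.0 - stock / capacity)
        stock = min(stock, capacity)
    return stock, total_reward


if __name__ == "__main__":
    for env in REPLENISHMENT:
        for behaviour in HARVEST:
            stock, reward = simulate(env, behaviour)
            print(f"{env:8s} {behaviour:9s} stock={stock:6.2f} reward={reward:7.2f}")
```

Under these assumed values, greedy harvesting exhausts the slowly replenishing desert within a few steps, after which no further reward is available, whereas moderate harvesting in the forest leaves most of the stock intact. A reflective governor of the kind described above would sit on top of such a loop and occasionally override the greedy choice; that layer is not modelled here.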