A framework for understanding an open scientific community using automated harvesting of public artifacts