First, what is a big site? Is it 10,000 pages, 100,000, or even more? In my experience, sites with more than 10,000 to 15,000 pages start to show the same symptoms. The main problem in most cases is the WCMS (web content management system) and how it is used to manage the site's content.
This can come down to how templates are designed to display and possibly reuse content. On sites that contain millions of pages, small problems recur again and again, growing from a minor glitch into a huge disaster that kills search visibility. So planning ahead can be really helpful when managing a large site. In most cases Google's problem is either too much of the same content or too little access to it. Google, like the other crawlers, should only be allowed to index valuable pages with legitimate, non-duplicate content.
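One common way to keep crawlers away from low-value or duplicated pages is a robots.txt file at the site root. A minimal sketch (the paths below are hypothetical examples of typical low-value sections, not taken from any specific site):

```
User-agent: *
# Hypothetical low-value sections a large site might block
Disallow: /search/
Disallow: /print/
```

Note that robots.txt blocks crawling, not indexing of already-known URLs, so it is a complement to, not a substitute for, fixing duplicate content at the template level.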
So what does that mean? If we look at the "too much" content problem, better known as duplicate content, it usually arises because the same content is being reused again and again and again and .... This leads to Google demoting the site or parts of it, or even dropping it from the index completely.
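To get a rough feel for how much copy is being reused, you can fingerprint page bodies and group URLs that collapse to the same fingerprint. A minimal sketch in Python (the URLs and page text are made-up examples; a real audit would fetch and strip actual HTML):

```python
import hashlib

def content_fingerprint(body: str) -> str:
    """Normalize whitespace and case, then hash, so near-identical
    reused copy collapses to the same fingerprint."""
    normalized = " ".join(body.lower().split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

# Hypothetical crawl sample: a tracking-parameter URL reuses the same copy.
pages = {
    "/product/1": "Great widget.  Buy now!",
    "/product/1?ref=mail": "Great widget. Buy now!",
    "/product/2": "A different widget entirely.",
}

seen = {}        # fingerprint -> first URL seen with it
duplicates = []  # (duplicate URL, original URL) pairs
for url, body in pages.items():
    fp = content_fingerprint(body)
    if fp in seen:
        duplicates.append((url, seen[fp]))
    else:
        seen[fp] = url

print(duplicates)
```

Here the tracking-parameter URL is flagged as a duplicate of /product/1, which is exactly the kind of template-driven reuse that multiplies across millions of pages.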
To find out whether there is a problem, it's always a good idea to set up Google Webmaster Tools and Yahoo Site Explorer for the site. Once the site has been verified you will start to see all kinds of data, such as crawler errors, the keywords the site is found for but not getting click-throughs on, and more.
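Verification in Google Webmaster Tools is typically done by adding a meta tag to the home page's head section. A sketch of what that looks like (the content token below is a placeholder; Google generates the real value for your account):

```
<head>
  <!-- Placeholder token: Google Webmaster Tools issues the actual value -->
  <meta name="google-site-verification" content="YOUR-TOKEN-HERE" />
</head>
```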
Although very basic, this should get you going in identifying crawler issues that would eventually lead to visibility problems in the search engines.