An article in L'Express caught my attention today. It reported a rise in phishing attempts targeting customers of a particular bank.

In the comments on the page, several people shared their concerns about web security. On the same page, someone who appeared to be a bank employee was answering comments and questions & said the following:

[Screenshot: comment posted on the L'Express article]

Now, that's a bold statement & strong commitment coming from the bank, assuming the person is really a bank employee. Let's see why.

The comment says: « Nous vous rassurons qu'il n'y a pas de fuite de données à la MCB » ("We assure you that there is no data leak at MCB").

« Pas de fuite » means that no (email) leak happened on the bank's side. Wait & watch.

[Screenshot: Google search result for the file]

The above file is publicly available on the Internet. That one file alone contains over a hundred email addresses, including those of Government officers & business people. The email addresses are a mix of @mail.gov.mu, @yahoo.com, @gmail.com, @hotmail.com, @intnet.mu, @orange.mu and several other business domains.

A quick pdfgrep will extract all the email addresses from the file in less than a minute.
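To give an idea of how little effort this takes, here is a minimal Python sketch of the same idea, applied to text that has already been pulled out of a PDF. The sample text, the example.com/example.org addresses and the deliberately simple pattern are assumptions for illustration; this is not the exact pdfgrep invocation, and real address syntax is broader than what this pattern covers.

```python
import re
from typing import List

# Deliberately simple pattern (an assumption); real-world address syntax is broader.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text: str) -> List[str]:
    """Return the unique email-like strings in text, in order of first appearance."""
    seen = set()
    found = []
    for match in EMAIL_PATTERN.findall(text):
        key = match.lower()  # treat Info@... and info@... as the same address
        if key not in seen:
            seen.add(key)
            found.append(match)
    return found


# Made-up sample standing in for text extracted from a brochure.
sample = """
Company A - contact: info@example.com
Company B - contact: sales@example.org, Info@example.com
"""

print(extract_emails(sample))
```

Anything that is readable as plain text in a public file can be collected this way, which is why the format of the file offers no protection by itself.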
Coming back to the statement that the bank didn't leak any data, does it still hold true? Now, this is just one quick method of extracting information. There are several other methods, such as a customized web crawler, that could be used to dig deeper & do effective data mining. Well, that's not the aim of the article, so I will stop here.
The above screenshots were taken from the L'Express website & Google.

Update - 14 November 2014, 23h40

In response to Shane's comment below, I am writing this. I agree with you, Shane, that the email addresses were published during a marketing campaign supporting the companies mentioned in the "brochure". However, my point is that the way the information is leaving MCB's website is incorrect & I'd call it a data leak. See, the email addresses can be seen & read by search engines. The brochure was made to support the companies & not to make them spam/phishing targets.

Jochen mentioned robots.txt. The PDF file resides at the following location: http://www.mcb.mu/fr/media/annuaire-de-lenvironnement_tcm10-4124.pdf. Is it wise to let search engines dig & index everything there? I don't think so.
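As a rough sketch of what such a rule would do, the Python snippet below uses the standard library's `urllib.robotparser` to test the PDF's URL against a hypothetical robots.txt. The `Disallow: /fr/media/` rule and the `is_crawlable` helper are assumptions made up for this example; this is not the bank's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration; NOT the bank's actual file.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: *
Disallow: /fr/media/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


pdf_url = "http://www.mcb.mu/fr/media/annuaire-de-lenvironnement_tcm10-4124.pdf"

print(is_crawlable(HYPOTHETICAL_ROBOTS_TXT, pdf_url))                  # False
print(is_crawlable(HYPOTHETICAL_ROBOTS_TXT, "http://www.mcb.mu/fr/"))  # True
```

Keep in mind that robots.txt is only a request that well-behaved crawlers honour. It keeps files out of search engine results, but a harvester that ignores it can still download anything that is publicly reachable.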

Spammers using data mining tools to search & dig through websites could easily pull out the email addresses since they aren't protected. Usually, on web pages one would break up email addresses using special characters to make them hard for data mining tools to detect. If email addresses or other personal information are saved in files, then direct access to the directories & web indexing should be avoided to prevent malicious data capture.
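Here is a minimal sketch of the "break the address" idea in Python. The `obfuscate_email` helper, the ` [at] ` / ` [dot] ` replacements and the sample address are assumptions chosen for illustration; other schemes exist, and none of them stops a harvester that is written to undo this specific substitution.

```python
import re

# The same kind of naive pattern a simple harvesting tool might use (an assumption).
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def obfuscate_email(address: str) -> str:
    """Replace '@' and '.' so the address no longer looks like one to a naive pattern."""
    return address.replace("@", " [at] ").replace(".", " [dot] ")


original = "jane.doe@example.com"  # made-up address
masked = obfuscate_email(original)

print(masked)                          # jane [dot] doe [at] example [dot] com
print(EMAIL_PATTERN.findall(original)) # ['jane.doe@example.com']
print(EMAIL_PATTERN.findall(masked))   # []
```

A human reader can still work out the address, while a simple pattern match comes back empty. It raises the cost for spammers a little; it does not replace restricting access to files that hold personal information.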