As you have noticed, I don’t post very often, so I am gratified that so many people have subscribed. I do make an effort to keep the usernames secure, encrypted, and I will never sell them. My limitation is I depend upon my provider to keep their servers secure. So far they have proven themselves competent and secure. I use multi-factor authentication to administer the site.
Too bad the rest of the world doesn’t take even these minimal measures. Just recently my personal ISP scanned for my personal email addresses “on the dark web”. To my pleasant surprise, they did a thorough job, but to my horror, they found my old email addresses along with cleartext passwords. I was really surprised that my ISP provided me with links to the password lists on the dark web. I was able to download the lists, which were files containing thousands of email addresses and cleartext passwords from compromised websites. I destroyed my copies of the files so no one could accuse me of hacking those accounts. I was lucky that my compromised accounts were ones I no longer used, so I could safely delete them. In short order, my ISP had delivered three shocks to me:
- My ISP delivered lists of usernames and passwords of other people to me.
- The passwords were stored in cleartext.
- Supposedly reputable websites did not have sufficient security to prevent someone from downloading the password files from the websites’ admin areas.
I guess that last item shouldn’t be a surprise because in #2 the websites actually stored the unencrypted password. Perhaps this wouldn’t bother me so much if the principles for secure coding were complicated or hard to implement.
If you think security is complicated, you’re not to blame. The book on the 13 Deadly Sins of Software Security became the 19 Deadly Sins in later editions, and now the book is up to the 24 Deadly Sins. An entire industry exists to scare you into hiring consulting services and buying their books. Secure software, though, isn’t that complicated, but it has a lot of details.
Let’s start with your application accepting passwords. The first rule, which everyone seems to get, is don’t echo the password when the user enters it. From the command line use getpass() or readpassphrase(). Most GUI frameworks offer widgets for entering passwords that don’t echo the user’s input.
Next, don’t allow the user to overrun your input buffers (more on that later). Finally, never store the password in an unencrypted form. This is the part where the various websites that exposed my usernames and passwords utterly failed. You never need to store the password. Instead, hash the password and store the hash. When you enter a password, either the server hashes it, or the client hashes it and transmits the hash over an encrypted channel such as TLS. The server then compares the result with the hash saved for your account. This is why your admin can never tell you your own password: they can’t reverse the hash.
This is an example of the devil being in the details, where the security isn’t complicated, just detailed. The concept of password hashing is decades old. The user enters their password, and the system hashes it immediately and compares the hash with what it has stored. If someone steals the system’s password file, they would need to generate passwords that happen to hash to the same values in the password file.
Simple in concept, but the details will get you. Early Unix systems used simple XOR style hashing, so it was easy to create passwords that hashed to the same values, or even reproduce the original password. Modern systems use a cryptographic hash such as SHA2-512. Even with a cryptographic hash, though, two different users who happen to choose the same password end up with the same hash. To prevent that, modern systems add a salt value to your password before hashing it. That salt value is usually a unique number stored with your username, so on most systems you need to steal both the password file and the file of salt values. Of course, in case someone does break into your system, you’ll have the wisdom to set the permissions on the password and salt files so only the application owner can even see them and read them.
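The enroll-then-verify flow described above can be sketched in a few lines. This is only an illustration under loud assumptions: `PasswordStore`, `Credential`, and `toyDigest` are hypothetical names, the store lives in memory, and `toyDigest` is a 64-bit FNV-1a stand-in that is NOT cryptographically secure. A real system would plug a vetted password-hashing function from a cryptographic library into the same slot.

```cpp
#include <cstdint>
#include <functional>
#include <iomanip>
#include <map>
#include <random>
#include <sstream>
#include <string>
#include <utility>

// Any function that maps bytes to a printable digest can be plugged in here.
using Digest = std::function<std::string(const std::string&)>;

// PLACEHOLDER ONLY: 64-bit FNV-1a. It is not a cryptographic hash and must
// not protect real passwords; it just keeps this sketch self-contained.
std::string toyDigest(const std::string& data) {
    std::uint64_t hash = 0xcbf29ce484222325ULL;
    for (unsigned char c : data) {
        hash ^= c;
        hash *= 0x100000001b3ULL;
    }
    std::ostringstream out;
    out << std::hex << std::setw(16) << std::setfill('0') << hash;
    return out.str();
}

// What gets stored per user: the salt and the digest, never the password.
struct Credential {
    std::string salt;
    std::string digest;
};

class PasswordStore {
public:
    explicit PasswordStore(Digest digest) : digest_(std::move(digest)) {}

    // Hash immediately with a fresh salt; the cleartext is not kept.
    void enroll(const std::string& user, const std::string& password) {
        const std::string salt = makeSalt();
        records_[user] = Credential{salt, digest_(salt + password)};
    }

    // Re-hash the attempt with the stored salt and compare digests.
    bool verify(const std::string& user, const std::string& password) const {
        const auto it = records_.find(user);
        if (it == records_.end()) {
            return false;
        }
        return digest_(it->second.salt + password) == it->second.digest;
    }

    // Exposed only so the sketch can be inspected; throws for unknown users.
    const Credential& record(const std::string& user) const {
        return records_.at(user);
    }

private:
    std::string makeSalt() {
        std::ostringstream out;
        out << std::hex << random_() << random_();
        return out.str();
    }

    Digest digest_;
    std::random_device random_;
    std::map<std::string, Credential> records_;
};
```

With `PasswordStore store(toyDigest);`, enrolling two users with the same password should leave two different digests in the store, because each user gets a different salt, and `store.verify()` succeeds only for the password that was enrolled.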
In short,
- Don’t echo sensitive information
- Don’t bother storing the unencrypted password
- Protect the hashed passwords.
We’re straying into systems administration and devops, so let’s get back to coding.
All of the deadly sins have fundamental common roots:
- Do not execute data.
- Do not trespass.
When you read something from the outside world, whether from a file, stream, or socket, don’t execute it. When you accept input from the outside world, think before you use it. Don’t allow buffer overruns. Do not embed input directly into a command without first escaping it or binding it to a named parameter. We all know the joke:
A good way to avoid executing data, is
“Do not trespass” means don’t refer to memory you may not own. Don’t overrun your array boundaries, don’t de-reference freed memory pointers, and pay attention to the number of arguments you pass into functions and methods. A common way of breaking into a system is overrunning an input buffer located in local memory until it overruns the stack frame. The data getting pushed into the buffer would be executable code. When the overrun overlaps the return pointer of the function, it substitutes an address in the overrun to get the CPU to transfer control to the payload in the buffer. A lot of runtime code is open source, so it just takes inspection to find the areas of code to exploit this type of vulnerability. Modern computer CPUs and operating systems often place executable code in read-only areas to protect against accidental (or malicious) overwrites, and may even mark data areas as no-execute — but you can’t depend upon those features existing. Scan the database of known vulnerabilities at https://cve.mitre.org/cve/ to see if your system needs to be patched. Write your own code so it is not subject to this vulnerability.
Buffer overruns are perhaps the most famous of the data trespasses.
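As a small illustration of staying inside a buffer you own, here is a sketch of a bounded copy. `boundedCopy` is a hypothetical helper written for this post, not a standard library function. It never writes past the end of the destination, always leaves it null terminated, and reports whether the input had to be truncated.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dest without ever writing past
// dest[destSize - 1]. Returns true only if the whole string fit.
bool boundedCopy(char* dest, std::size_t destSize, const char* src) {
    if (dest == nullptr || destSize == 0 || src == nullptr) {
        return false;  // nothing we can safely write to or read from
    }
    const std::size_t srcLength = std::strlen(src);
    // Leave room for the terminating null character.
    const std::size_t toCopy = (srcLength < destSize) ? srcLength : destSize - 1;
    std::memcpy(dest, src, toCopy);
    dest[toCopy] = '\0';
    return toCopy == srcLength;
}
```

The caller decides what a `false` result means. For untrusted input, rejecting the request is usually safer than silently carrying on with the truncated value.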
With C++ it is easy to avoid data trespasses. C++ functions and methods are strongly typed so if you attempt to pass the wrong number of arguments, it won’t even compile. This avoids a common C error of passing an inadequate number of arguments to a function so the function accesses random memory for missing arguments.
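To see the compiler doing this checking, the following sketch asks it whether various calls would compile, without making them. It assumes C++17 for `std::is_invocable_v`, and `area` is a hypothetical function used only for illustration.

```cpp
#include <type_traits>

// Hypothetical function used only for illustration.
int area(int width, int height) {
    return width * height;
}

// std::is_invocable_v (C++17) reports whether a call with the given
// argument types would compile, without performing the call.
using AreaFn = decltype(&area);

static_assert(std::is_invocable_v<AreaFn, int, int>,
              "two ints are accepted");
static_assert(!std::is_invocable_v<AreaFn, int>,
              "a missing argument is rejected");
static_assert(!std::is_invocable_v<AreaFn, int, int, int>,
              "an extra argument is rejected");
static_assert(!std::is_invocable_v<AreaFn, int, const char*>,
              "an argument of the wrong type is rejected");
```

If any of these assertions were wrong, the file would fail to compile, which is exactly the property the paragraph above relies on.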
Despite its strong typing, C++ requires care to avoid container boundary violations. std::vector::operator[] does not produce an exception when used to access beyond the end of a vector (the behavior is simply undefined), nor does it extend the vector when you write beyond its end. std::vector::at() does produce exceptions on out of range accesses. Adding to the end of the array with std::vector::push_back() may proceed until memory is exhausted or an implementation defined limit is reached. I’m going to reserve memory management for another day. In the meantime, here is some example code demonstrating the behavior of std::vector:
// -*- mode: c++ -*-
////
// @copyright 2022 Glen S. Dayton. Permission granted to copy this code as long as this notice is included.
// Demonstrate accessing beyond the end of a vector
#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <iterator>
#include <stdexcept>
#include <typeinfo>   // needed for typeid
#include <vector>
using namespace std;
int main(int /*argc*/, char* argv[]) {
    int returnCode = EXIT_FAILURE;
    try {
        vector< int> sample( 10, 42 );
        std::copy( sample.begin(), sample.end(), std::ostream_iterator< int>(cout, ","));
        cout << endl;
        cout << "Length " << sample.size() << endl;
        // Out of range: no exception, undefined behavior (deliberate, for demonstration)
        cout << sample[12] << endl;
        // Out of range: throws std::out_of_range, so nothing below this line in the try block runs
        cout << sample.at( 12 ) << endl;
        cout << "Length " << sample.size() << endl;
        cout << sample.at( 12 ) << endl;
        returnCode = EXIT_SUCCESS;
    } catch (const exception& ex) {
        cerr << argv[0] << ": Exception: " << typeid(ex).name() << " " << ex.what() << endl;
    }
    return returnCode;
}
And its output:
42,42,42,42,42,42,42,42,42,42,
Length 10
0
/Users/gdayton19/Projects/containerexample/Debug/containerexample: Exception: St12out_of_range vector
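The 0 printed for sample[12] is not guaranteed. Reading past the end with operator[] is undefined behavior, so another compiler or another run could print anything or crash. When an index comes from the outside world, one option is to check it before touching the vector. The sketch below assumes C++17 for `std::optional`, and `valueAt` is a hypothetical helper, not part of the standard library.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: returns the element at index, or an empty optional
// when the index is out of range. It neither throws nor reads memory the
// vector does not own.
std::optional<int> valueAt(const std::vector<int>& values, std::size_t index) {
    if (index < values.size()) {
        return values[index];  // safe: the index was checked first
    }
    return std::nullopt;
}
```

A caller can then write `valueAt(sample, 12).value_or(-1)` to supply a default of its own choosing, or test `has_value()` and report the bad index, instead of relying on an exception handler.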
C++ does not make it easy to limit the amount of input your program can accept into a string. The stream extraction operator, >>, does pay attention to a field width set with the stream’s width() method or the setw manipulator, but it stops accepting on whitespace. You must use a getline() of some sort to get a string with spaces, or use C++14’s quoted string facility. Here’s an example of the extraction operator>>:
// -*- mode: c++ -*-
#include <cstdlib>
#include <iomanip>
#include <iostream>
#include <limits>
#include <stdexcept>
#include <string>
using namespace std;
int main(int /*argc*/, char* argv[]) {
    int returnCode = EXIT_FAILURE;
    constexpr auto MAXINPUTLIMIT = 40U;
    try {
        string someData;
        cout << "String insertion operator input? " << flush;
        cin >> setw(MAXINPUTLIMIT) >> someData;
        cout << endl << " This is what was read in: " << endl;
        cout << quoted(someData) << endl;
        // Discard the rest of line
        cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        cout << "Try it again with quotes: " << flush;
        cin >> setw(MAXINPUTLIMIT) >> quoted(someData);
        cout << endl;
        cout << " Quoted string read in: " << endl;
        cout << quoted(someData) << endl;
        cout << "Unquoted: " << someData << endl;
        cout << "Length of string read in: " << someData.size() << endl;
        returnCode = EXIT_SUCCESS;
    } catch (const exception& ex) {
        cerr << argv[0] << ": Exception: " << ex.what() << endl;
    }
    return returnCode;
}
And some sample output from it:
String insertion operator input? The quick brown fox jumped over the lazy dog.
 This is what was read in:
"The"
Try it again with quotes: "The quick brown fox jumped over thge lazy dog."
 Quoted string read in:
"The quick brown fox jumped over thge lazy dog."
Unquoted: The quick brown fox jumped over thge lazy dog.
Length of string read in: 46
The quoted() manipulator ignores the field width limit on input.
You need to use getline() to read complete unquoted strings with spaces. The getline() used with std::string, though, ignores the field width. Here is some example code using getline():
// -*- mode: c++ -*-
#include <cstdlib>
#include <iomanip>
#include <iostream>
#include <stdexcept>
#include <string>
using namespace std;
int main(int /*argc*/, char* argv[]) {
    int returnCode = EXIT_FAILURE;
    constexpr auto MAXINPUTLIMIT = 10U;
    try {
        string someData;
        cout << "String getline input? " << flush;
        cin.width(MAXINPUTLIMIT); // This version of getline() ignores width.
        getline(cin, someData);
        cout << endl << " This is what was read in: " << endl;
        cout << quoted(someData) << endl;
        returnCode = EXIT_SUCCESS;
    } catch (const exception& ex) {
        cerr << argv[0] << ": Exception: " << ex.what() << endl;
    }
    return returnCode;
}
And a sample run of the above code:
String getline input? The rain in Spain falls mainly on the plain.
 This is what was read in:
"The rain in Spain falls mainly on the plain."
Notice the complete sentence was read in even though the field width was set to only 10 characters.
To limit the amount of input, we must resort to std::istream::getline():
// -*- mode: c++ -*-
#include <cstdlib>
#include <cstring>
#include <iomanip>
#include <iostream>
#include <stdexcept>
#include <string>
using namespace std;
int main(int /*argc*/, char* argv[]) {
    int returnCode = EXIT_FAILURE;
    constexpr auto MAXINPUTLIMIT = 10U;
    char buffer[MAXINPUTLIMIT+1];
    memset(buffer, 0, sizeof(buffer));
    try {
        cout << "String getline input? " << flush;
        cin.getline(buffer, sizeof(buffer));
        cout << endl << " This is what was read in: " << endl;
        cout << "\"" << buffer<< "\"" << endl;
        returnCode = EXIT_SUCCESS;
    } catch (const exception& ex) {
        cerr << argv[0] << ": Exception: " << ex.what() << endl;
    }
    return returnCode;
}
And its sample use:
String getline input? I have met the enemy and thems is us.
 This is what was read in:
"I have met"
Notice the code only asks for 10 characters and it only gets 10 characters. I used a plain old C char array rather than a fancier C++ std::array<char, 10> because char doesn’t have a constructor, so the elements of a default-initialized array are indeterminate. An easy way to make sure a C style string is null terminated is to fill it with 0 with a memset(). Of course, you could fill the array entirely with fill() from <algorithm>, but sometimes the more direct method is lighter, faster, and more secure.
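One detail the std::istream::getline() example above does not show: when the line is longer than the buffer, getline() sets the stream’s failbit and leaves the rest of the line unread, so the next read fails until the stream is cleared. The sketch below is one way to handle that. `readLimitedLine` and `kMaxInput` are hypothetical names, and the limit of 10 is assumed to match the example. It also uses a std::array with empty braces, which value-initializes every element to zero and so needs no memset().

```cpp
#include <array>
#include <cstddef>
#include <istream>
#include <limits>
#include <sstream>   // only needed for the std::istringstream usage shown afterwards
#include <string>

constexpr std::size_t kMaxInput = 10;  // assumed limit, matching the example above

// Hypothetical helper: returns at most kMaxInput characters of the next line
// and throws away whatever else was typed on that line.
std::string readLimitedLine(std::istream& in) {
    std::array<char, kMaxInput + 1> buffer{};  // {} zero-fills every element
    in.getline(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    if (in.fail() && !in.eof()) {
        // The line was too long: reset the error state, then skip to the
        // end of the line so the next read starts on a fresh line.
        in.clear();
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return std::string(buffer.data());
}
```

Fed a `std::istringstream` holding "I have met the enemy and thems is us." followed by a second line "short", the first call returns "I have met" and the second returns "short" rather than failing.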