A Note from the Instructor on Scanf

CptS 121, Summer 2011, Section 3
Gokcen Cilingir - June 23, 2011

If you want to learn more about scanf and get explanations for its occasionally “weird” behavior (some of which we encountered in today’s lecture), you can read the reference below and my notes on the issue. This may be a rather advanced discussion at this point in the class, so don’t feel bad if some parts are confusing. Let me know if something is unclear.

Scanf Reference

Reads data from stdin and stores it, according to the format parameter, into the locations pointed to by the additional arguments. The additional arguments should point to already allocated objects of the type specified by their corresponding format tag within the format string.

Parameters

format

C string that contains one or more of the following items:

  • Whitespace character: the function will read and ignore any whitespace characters (this includes blank spaces and the newline and tab characters) which are encountered before the next non-whitespace character. This includes any quantity of whitespace characters, or none.
  • Non-whitespace character, except percentage signs (%): Any character that is not either a whitespace character (blank, newline or tab) or part of a format specifier (which begins with a % character) causes the function to read the next character from stdin and compare it to this non-whitespace character. If it matches, it is discarded and the function continues with the next character of format. If it does not match, the function fails, returning and leaving subsequent characters of stdin unread.
  • Format specifiers: A sequence formed by an initial percentage sign (%) indicates a format specifier, which is used to specify the type and format of the data to be retrieved from stdin and stored in the locations addressed by the additional arguments. A format specifier follows this prototype:
    %[*][width]type
    where:

    *      An optional starting asterisk indicates that the data is to be retrieved from stdin but ignored, i.e. it is not stored in the corresponding argument.
    width  Specifies the maximum number of characters to be read in the current reading operation.

Notes:

  • By putting a whitespace character in the format string, we’re telling the scanf function: “At this point in your reading from the input buffer, you may skip any whitespace (zero or more characters) you encounter.” Whitespace anywhere except at the very end of the format string is OK. When whitespace is the last character of the format string, scanf keeps skipping whitespace (including the newline from the Enter key) until you enter a non-whitespace character, so your program appears to hang.
  • Putting whitespace before a numeric placeholder (like %d or %lf) in the format string makes no difference in the behavior of scanf. These placeholders already cause scanf to skip over preceding whitespace in the input buffer, so skipping whitespace is part of the process of matching a numeric placeholder. Whether you say so explicitly in the format string or not, preceding whitespace will be skipped and cleared out of the input buffer (this applies when you’re using a numeric placeholder).
  • When scanf tries to read a numeric value from the input (this happens when the format string contains a numeric placeholder), it keeps reading until it encounters something that cannot be part of the data type specified by the placeholder. The character that stopped the matching is not removed from the input buffer. For example, if you type 42 and press Enter, %d consumes 42 and the newline stays in the buffer. This explains what happens with the code given in the last slide of the third lecture note.
  • The %c placeholder matches a single character, and there is no “skipping whitespace” involved in the matching process. This makes all the sense in the world, since whitespace characters are still characters and banning them from being matched by a character placeholder would be discriminatory. Anyway, if you want whitespace before or after a character to be skipped, just place a whitespace before or after the placeholder in the format string, respectively (keeping in mind the note above about whitespace at the end of the format string).